PROMOTERS OF FILAMENTOUS FUNGI AND USE THEREOF

This invention relates to expression and expression 
followed by secretion of proteins from filamentous fungi. 

BACKGROUND OF THE INVENTION 

One goal of recombinant DNA technology is the insertion 
of DNA segments which encode commercially or scientifically 
valuable proteins into a host cell which is readily and 
economically available. Genes selected for insertion are 
normally those which encode proteins produced in only limited 
amounts by their natural hosts or those which are indigenous to 
hosts too costly to maintain. Transfer of the genetic 
information in a controlled manner to a host which is capable of 
producing the protein in either greater yield or more 
economically in a similar yield provides a more desirable 
vehicle for protein production. 

Genes encoding proteins contain promoter regions of DNA 
which are essentially attached to the 5' terminus of the protein
coding region. The promoter regions contain the binding site 
for RNA polymerase II. RNA polymerase II effectively catalyses 
the assembly of the messenger RNA complementary to the 
appropriate DNA strand of the coding region. In most promoter 
regions, a nucleotide base sequence related to the sequence 
known generally as a "TATA box" is present and is generally 
disposed some distance upstream from the start of the coding 
region and is required for accurate initiation of 
transcription. Other features important or essential to the 
proper functioning and control of the coding region are also 
contained in the promoter region, upstream of the start of the 
coding region. 

Filamentous fungi, particularly the filamentous 
ascomycetes such as Aspergillus, e.g. Aspergillus niger,
represent a class of micro-organisms suitable as recipients of
foreign genes coding for valuable proteins. Aspergillus niger
and related species are currently used widely in the industrial 
production of enzymes e.g. for use in the food industry. Their 
use is based on the secretory capacity of the microorganism. 
Because they are well characterized and because of their wide 
use and acceptance, there is both industrial and scientific 
incentive to provide genetically modified and enhanced cells of 
A. niger and related species, including A. nidulans, in order to
obtain useful proteins. 

Expression and secretion of foreign proteins from 
filamentous fungi has not yet been achieved. It is by no means 
clear that the strategies which have been successful in yeast 
would be successful in filamentous fungi such as Aspergillus. 
Evidence has shown that yeast is an unsuitable system for the 
expression of filamentous fungal genes (Pentilla et al., Molec.
Gen. Genet. (1984) 194:494-499) and that yeast genes do not
express in filamentous fungi. Genetic engineering techniques
have only recently been developed for Aspergillus nidulans and
Aspergillus niger. These techniques involve the incorporation
of exogenously added genes into the Aspergillus genome in a form 
in which they are able to be expressed. 

To date no foreign proteins have been expressed in and 
secreted from filamentous fungi using these techniques. This 
has been due to a lack of suitable expression vectors and their 
constituent components. These components include Aspergillus 
promoter sequences described above, the region encoding the 
desired product and the associated sequences which may be added 
to direct the desired product to the extracellular medium. 

As noted, expression of the foreign gene by the host 
cell requires the presence of a promoter region situated 
upstream of the region coding for the protein. This promoter 
region is active in controlling transcription of the coding 
region with which it is associated, into messenger RNA which is 
ultimately translated into the desired protein product.
Proteins so produced may be categorized into two classes on the
basis of their destiny with respect to the host. 

A first class of proteins is retained intracellularly.
Extraction of the desired protein, when intracellular, requires 
that the genetically engineered host be broken open or lysed in 
order to free the product for eventual purification. 
Intracellular production has several advantages. The protein 
product can be concentrated, i.e. pelleted, with the cellular
mass, and if the product is labile under extracellular 
conditions or structurally unable to be secreted, this is a 
desired method of production and purification. 

A second class of proteins comprises those which are secreted
from the cell. In this case, purification is effected on the
extracellular medium rather than on the cell itself. The
product can be extracted using methods such as affinity
chromatography, and continuous flow fermentation is possible.
Also, certain products are more stable extracellularly and
benefit from extracellular purification. Experimental evidence
has shown that secretion of proteins in eukaryotes is almost 
always dictated by a secretion signal peptide (hereafter called 
signal peptide) which is usually located at the amino terminus 
of the protein. Signal peptides have characteristic 
distributions as described by G. Von Heijne in Eur. J. Biochem 
17-21 (1983) and are recognizable by those skilled in the art. 
The signal peptide, when recognized by the cell, directs the 
protein into the cell's secretory pathway. During secretion, 
the signal peptide is cleaved off making the protein available 
for harvesting in its mature form from the extracellular medium. 

Both classes of protein, intracellular and 
extracellular, are encoded by genes which contain a promoter 
region coupled to a coding region. Genes encoding 
extracellularly directed proteins differ from those encoding 
intracellular proteins in that, in genes encoding extracellular 
proteins, the portion of the coding region nearest to the
promoter (which is the first part to be transcribed by RNA
polymerase) encodes a signal peptide. The nucleotide sequence 
encoding the signal peptide, hereafter denoted the signal 
peptide coding region or the signal sequence, is operationally 
part of the coding region per se. 

SUMMARY OF THE INVENTION 

A system has now been developed by which filamentous
fungi may be transformed to express a desired protein. With
this system, transformation can result in a filamentous fungus
which is capable not only of expressing the protein but of
secreting that protein as well, regardless of whether or not the
protein is a naturally secreted one. In addition, the level at
which the protein is expressed can be controlled according to
certain aspects of the invention. It will be appreciated by
those skilled in the art that the system provided hereby permits
filamentous fungi to function as valuable sources of proteins
and provides an alternative which in many applications is
superior to bacterial and yeast systems.

Thus, in a general aspect, the invention provides a 
filamentous fungus transformation system by which the genetic 
constitution of these fungus cells may be modified so as to 
alter either the nature or the amount of the proteins expressed 
by these cells. More specific aspects of the invention are
defined below.

In the present invention, from one aspect, a promoter 
region associated with a coding region in filamentous fungi such 
as A. niger , A. nidulans or a related species is identified and 
isolated, appropriately joined in a functional relationship with 
a second, different coding region, outside the cell, and then
re-introduced into a host filamentous fungus using an
appropriate vector. Transformed host cells express the protein
of the second coding region, under the control of the introduced
promoter region. The second coding region may be one which is
foreign to the host species, in which case the host will express
and in some cases secrete a protein not naturally expressed by
the given host. Alternatively, the second coding region may be
one which is natural to the host, in which case it is associated
with a promoter region different from the promoter region with 
which it naturally associates in the given host, to give 
modified or enhanced protein expression and secretion. 

Where the second coding region is one which encodes a 
protein which is normally secreted, it will contain a sequence 
of nucleotides at its 5' terminus, i.e. a signal peptide coding
region, which will result, following transcription and 
translation, in the presence of a signal peptide at the amino 
terminus of the protein product. The signal peptide can be 
recognized by the fungal host and the protein product can then 
be directed into the secretory pathway of the cell and secreted. 

In another aspect, the present invention provides DNA 
sequences coding for a signal peptide, i.e. a signal peptide
coding region, which is recognized by filamentous fungi,
preferably of the ascomycete class, e.g. A. niger and A.
nidulans, so as to signal secretion of a protein encoded within
the coding region. These signal peptide coding regions can be 
coupled to a coding region which encodes a protein naturally 
retained intracellularly in order to elicit secretion of that 
protein. While normally secreted proteins are encoded by coding 
regions which usually contain these signal peptide coding 
regions naturally so that incorporation of a signal peptide 
coding region is not usually necessary, the signal peptide 
coding regions of the present invention may nevertheless be 
substituted for the naturally occurring sequence, if
desired. Accordingly, where a signal peptide coding region is 
coupled to a region encoding a non-secreted protein, it will be 
foreign to that coding region. 

The present invention provides the ability to introduce 
foreign coding regions into filamentous fungi along with
promoters to arrange for the host fungi to express different
proteins. It also provides the ability to regulate
transcription of the individual genes which occur naturally
therein or foreign genes introduced therein, via the promoter
region which has been introduced into the host along with the
gene. For example, the promoter regions naturally associated
with the alcohol dehydrogenase I (alcA) gene and the aldehyde
dehydrogenase (aldA) gene of A. nidulans are regulatable by
means of ethanol, threonine, or other inducing substances in the
extracellular medium. This effect is dependent on a gene known
as alcR. When the alcA or aldA promoter region
is associated with a foreign protein coding region in
Aspergillus or the like in accordance with the present
invention, similar regulation of the expression of the foreign
genes by ethanol or other inducers can be achieved.

As a further example, the promoter region naturally
associated with the glucoamylase gene in Aspergillus niger and
used in embodiments of the present invention is positively
induced with starch and other sugars.

In another aspect, the present invention provides a DNA
construct which contains a promoter region in operative
association with a signal peptide coding region and which
permits introduction of a region coding for a desired protein at
a position 3' of and in reading frame with the signal peptide
coding region. The promoter/signal construct is suitably
provided with a flanking restriction site to allow precise
coupling of the protein coding region to the signal peptide
coding region.

In another aspect, the present invention provides a
genetic vector capable of introducing the segment carrying the
promoter and signal peptide coding region with integral protein
coding region into the genome of a filamentous fungus host. The
protein coding region can be either native to or foreign to the
host filamentous fungus.

Thus the present invention provides DNA sequences
active as promoter regions and DNA sequences active as signal
peptide coding regions in cells of filamentous fungi such as
Aspergillus niger, Aspergillus nidulans and the like.

The present invention thus also provides a novel 
construct comprising a DNA sequence active as a promoter region 
in cells of filamentous fungi, and a coding region chemically 
bound to said DNA sequence in operative association therewith, 
said coding region being capable of expression in a filamentous 
fungus host under influence of said DNA sequence. 

The present invention further provides a process of 
genetically modifying a filamentous fungus host cell which 
comprises introducing into the host cell, by means of a suitable 
vector, a coding region capable of expression in the transformed 
Aspergillus host cell and a promoter region active in the 
transformed Aspergillus host cell, the coding region and the 
promoter being chemically bound together and in operative 
association with one another. 

This process also encompasses the introduction of 
multiple copies of the selected construct into the host to 
provide for enhanced levels of gene expression. If necessary or 
desirable, introduction of multiple construct copies is 
accompanied by introduction of multiple copies of genes encoding 
products having a regulatory effect on the construct. 

The present invention also comprises filamentous fungal 
cells transformed by the constructs of the invention. 

DESCRIPTION OF THE PREFERRED EMBODIMENTS 

Preferred hosts according to the invention are the
filamentous fungi of the ascomycete class, most preferably
Aspergillus sp., including A. niger, A. nidulans and the like.

In the preferred form of the invention, the promoter
region associated with the Aspergillus niger glucoamylase
gene, or the promoter region associated with either the alcohol
dehydrogenase I gene or the aldehyde dehydrogenase gene of
Aspergillus nidulans, is used in preparing an appropriate vector
plasmid.

Any or all of these promoter regions are regulatable
in the host cell by the addition of the appropriate inducer
substance. In alcA and aldA, this induction is mediated by the
protein product of a third gene, alcR, which is controlled via
the promoter. Evidence indicates that the availability of alcR
product can limit the promoting function of the alcA and aldA
promoters when multiple copies of a construct containing the
alcA promoter or the aldA promoter are introduced into a host
without corresponding introduction of multiple copies of the
alcR gene. In such a case, the amount of alcR product which the
host can produce may be insufficient to meet the demands of the
several promoters requiring induction by the alcR product.
Thus, transformation of filamentous fungal hosts by multiple
copies of constructs containing the alcA or aldA promoter is
accompanied by introduction of multiple copies of the alcR gene,
according to a preferred embodiment of the present invention.
In other instances, transcription can be repressed, for example
by utilizing high levels of glucose (and some other carbon
sources) in the medium to be used for growth of the host. The
expression of the product encoded by the coding region and
controlled by the promoter is then delayed until after the end
of the cell growth phase, when all of the glucose has been
consumed and the gene is derepressed. The inducer may be added
at this point to enhance the activity of the promoter.

The destination of the protein product of the coding 
region which has been selected to be expressed under the control 
of the promoter described above is determined by the nucleotide 
sequence of that coding region. As mentioned, if the protein
product is naturally directed to the extracellular environment,
it will inherently contain a secretion signal peptide coding
region. Protein products which are normally intracellularly 
located lack this signal peptide. 

Thus, for the purposes of the present disclosure it is 
to be understood that a "coding region" encodes a protein which 
is either retained intracellularly or is secreted. (This
"coding region" is sometimes referred to in the art as a
structural gene, i.e. that portion of a gene which encodes a
protein.) Where the protein is retained within the cell that
produces it, the coding region will usually lack a signal 
peptide coding region. Secretion of the protein encoded within 
the coding region can be a natural consequence of cell 
metabolism in which case the coding region inherently contains a 
signal peptide coding region linked naturally in translation 
reading frame with that segment of the coding region which 
encodes the secreted protein. In this case, insertion of a 
signal peptide coding region is not required. In the 
alternative, the coding region may be manipulated to introduce a 
signal peptide coding region which is foreign to that portion of 
the coding region which encodes the secreted protein. This 
foreign signal peptide coding region may be required where the 
coding region does not naturally contain a signal peptide coding 
region or it may simply replace the natural signal peptide 
coding region in order to obtain enhanced secretion of the 
desired protein with which the natural signal peptide is
normally associated.

In accordance with another preferred aspect of the 
invention, therefore, a signal peptide coding region is 
provided, if required, i.e. when the coding region which has been
selected to be expressed under the control of the promoter
described above does not itself contain a signal peptide coding
region. The signal peptide coding region used is preferably
either one which is associated with the Aspergillus niger
glucoamylase gene or a synthetic signal peptide coding region
which is made in vitro and used in the preparation of an
appropriate vector plasmid. Most preferably, these signal
peptide coding regions are modified at one or both termini to 
permit ligation thereof with other components of a vector. This 
ligation is effected in such a way that the signal peptide 
coding region is interposed between the promoter region and the 
protein encoding segment of the coding region such that the 
signal peptide coding region is in frame with that segment of 
the coding region which encodes the mature, functional protein. 

BRIEF REFERENCE TO THE DRAWINGS 

Figure 1A is an illustration of the base sequence of
the DNA constituting the coding region and promoter region of
the alcohol dehydrogenase I (alcA) gene of Aspergillus nidulans;

Figure 1B is an illustration of the base sequence of
DNA constituting the coding region and promoter region of the
aldehyde dehydrogenase (aldA) gene of Aspergillus nidulans;

Figure 2 is a diagrammatic illustration of a process of 
constructing plasmid pDG6 useful in transforming a filamentous 
fungal cell; 

Figure 3 is a linear representation of a portion of the 
plasmid pDG6 of Fig. 2; 

Figure 4 is a diagrammatic illustration of the plasmid
maps of pGL1 and pGL2;

Figure 5 is an illustration of a selection of synthetic 
linker sequences for insertion into plasmid pGL2; 

Figure 6 is an illustration of the nucleotide sequence 
of a fragment of pGL2; 

Figure 7 is an illustration of the plasmid maps of pGL2B and
pGL2BIFN;

Figure 8 is an illustration of the nucleotide sequence
of a fragment of pGL2BIFN;

Figure 9 illustrates plasmid pALCA1S and a method for
its preparation;

Figure 10 illustrates the plasmid map of pALCA1SIFN and
a method for its preparation;

Figure 11 represents the nucleotide sequence of a
fragment of pALCA1SIFN;

Figure 12 illustrates the plasmid map of pGL2CENDO; 

Figure 13 represents the nucleotide sequence of a 
fragment of pGL2CENDO; 

Figure 14 represents a plasmid map of pALCA1SENDO;

Figure 15 represents the nucleotide sequence of a
fragment of pALCA1SENDO;

Figure 16 illustrates plasmid pALCA1AMY and a method
for its preparation; and

Figure 17 represents the nucleotide sequence of a
segment of pALCA1AMY shown in Figure 16.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS 

In the present invention, an appropriate promoter 
region of a functioning gene in A. niger or A. nidulans or the 
like is identified. Procedures for identifying each of the 
genes containing the desired promoter regions are similar and 
for that reason, the manner of locating and identifying the alcA 
gene and promoter therein is outlined. For this purpose, cells 
of the chosen species are induced to express the selected
protein, e.g. alcA, and from these cells is isolated the
messenger RNA. One portion thereof, as yet unidentified, codes
for alcA. Complementary DNA is prepared from
the mRNA fragments and cloned into a vector. Messenger RNA
isolated from induced A. nidulans is size fractionated to enrich
for alcA sequences, end labelled and hybridized to the cDNA
clones made from the alcA+ strain. That clone containing the
cDNA which hybridizes to alcA+ mRNA contains the DNA copy of
the alcA mRNA. This piece is hybridized to a total DNA gene 
bank from the chosen Aspergillus species, to isolate the 
selected coding region e.g. alcA and its flanking regions. The 
aldA coding region was isolated using analogous procedures. 

The coding region starts at its 5' end with a codon
ATG coding for methionine, in common with other coding regions
and proteins. Where the amino acid sequence of the expressed
protein is known, the DNA sequence of the coding region is
readily recognizable. Immediately "upstream" of the ATG codon
is the leader portion of the messenger RNA preceded by the 
promoter region. 

With reference to Figs. 1A and 1B, these show portions
of the total DNA sequence from A. nidulans, with conventional
base notations. The portion shown in Figure 1A contains the
promoter region and the coding region of the alcA gene which
encodes the enzyme alcohol dehydrogenase I. The portion shown
in Figure 1B contains the promoter region and the region
encoding the enzyme aldehyde dehydrogenase, i.e. aldA. (In both
cases, the term "IVS" represents intervening sequences.) The
amino acid sequences of these two enzymes are known in other
species. From these, the regions 10 and 10' are recognisable as
the coding regions. Each coding region starts at its 5'
("upstream") end with methionine codon ATG at 12. The
appropriate amino acid sequences encoded by the protein coding
region are entered below the respective rows on Figs. 1A and 1B,
in conventional abbreviations. Immediately upstream of codon 12
is the region coding for the messenger RNA leader and the
promoter region, the length of which, in order to contain all
the essential structural features enabling it to function as a
promoter, now needs to be determined or at least estimated.
Figures 1A and 1B each show a sequence of about 800 bases
upstream from the ATG codon 12.

It is predictable from analogy with other known 
promoters that all the functional essentials are likely to be 
contained within a sequence of about 1000 bases in length, 
probably within the 800 base sequence illustrated, and most 
likely within the first 200 - 300 base sequence, i.e. back to 
about position 14 on Figs. 1A and 1B. An essential function of
a promoter region is to provide a site for accurate initiation
of transcription, which is known to be a TATA box sequence.
Such a sequence is found at 16 on the alcA promoter sequence of
Fig. 1A, and at 16' on the aldA promoter sequence of Fig. 1B.
Another function of a promoter region is to provide an
appropriate DNA sequence active in regulation of the gene
transcription, e.g. a binding site for a regulatory molecule
which enhances gene transcription, or for rendering the gene
active or inactive. Such regulator regions are within the
promoter region illustrated in Figs. 1A and 1B for the alcA and
aldA genes, respectively. 
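
By way of illustration only, the search for a TATA box upstream of the start codon can be sketched in Python. The sequence below is an invented toy sequence, not the sequence of Fig. 1A or Fig. 1B, and the function name find_tata_boxes and the simplified pattern TATA[AT]A are assumptions made for this example.

```python
import re

# Hypothetical toy sequence (NOT the alcA or aldA sequence of the figures):
# 5' flanking DNA followed by an ATG start codon.
TOY_SEQUENCE = "GCGCGTCCTATAAAGGGCCGATCGTTGACCGGCAAGATGTCTGTC"

def find_tata_boxes(seq, start_codon_pos):
    """Return (position, distance upstream of the start codon) for each TATA-like motif."""
    hits = []
    # Only the region upstream of the start codon is searched.
    for match in re.finditer(r"TATA[AT]A", seq[:start_codon_pos]):
        hits.append((match.start(), start_codon_pos - match.start()))
    return hits

# Position of the start codon in the toy sequence.
start = TOY_SEQUENCE.find("ATG")
print(find_tata_boxes(TOY_SEQUENCE, start))
```

With this toy sequence the single hit lies 28 bases upstream of the start codon. A real promoter would also be examined for the regulatory sites mentioned above, which this sketch does not attempt.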

The precise upstream 5' terminus of the DNA sequence 
used herein as a promoter region is not critical, provided that 
it includes the essential functional sequences as described 
herein. Excess DNA sequences upstream of the 5' terminus are
unnecessary, but unlikely to be harmful in the present invention. 

Having determined the extent of the sequence containing 
all the essential functional features to constitute a promoter 
region of the given gene, by techniques described herein, the 
next step is to cut the DNA chain at a convenient location 
downstream of the promoter region terminus and to remove the 
protein coding region, to leave basically a sequence comprising 
the promoter region and sometimes part of the region coding for
the messenger RNA leader. For this purpose, appropriately
positioned restriction sites are to be located, and then the DNA 
treated with the appropriate restriction enzymes to effect 
scission. Restriction sites are recognizable from the alcA 
sequence illustrated in Figure 1A. For the upstream cutting, a 
site is chosen sufficiently far upstream to include in the 
retained portion all of the essential functional sites for the 
promoter region. As regards the downstream scission, no 
restriction site presents itself exactly at the ATG codon 12 in 
the case of alcA. The closest downstream restriction site 
thereto is the sequence GGGCCC at 13, at which the chain can be 
cut with restriction enzyme Apa I. If desired, after such 
scission, the remaining nucleotides from location 13 to location 
12 can be removed, in stepwise fashion, using an exonuclease. 
With knowledge of the number of such nucleotides to be removed, 
the exonuclease action can be appropriately stopped when the 
location 12 is passed. By locating a similar restriction site
downstream of the methionine codon 12 of the aldA coding region
shown in Fig. 1B, this promoter region is similarly excised for
subsequent use. In many cases, residual nucleotides on the 5'
terminus of the promoter region are not harmful to and do not
significantly interfere with the functioning of the promoter
region, so long as the reading frame of the base triplets is
maintained.
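
The choice of a downstream restriction site, and the number of nucleotides an exonuclease would then have to remove, can be sketched as follows. This is a minimal Python sketch on an invented sequence: only the recognition sequence GGGCCC comes from the text, while the function names and the toy sequence are hypothetical. The count is taken to the start of the recognition site rather than to the exact cut position of the enzyme.

```python
APA_I_SITE = "GGGCCC"  # recognition sequence named in the text

def nearest_downstream_site(seq, atg_pos, site=APA_I_SITE):
    """Return the index of the first `site` at or after the start codon, or None."""
    idx = seq.find(site, atg_pos)
    return idx if idx != -1 else None

def bases_to_trim(seq, atg_pos, site=APA_I_SITE):
    """Number of bases from the first base of the start codon to the start of the site."""
    site_pos = nearest_downstream_site(seq, atg_pos, site)
    if site_pos is None:
        raise ValueError("no restriction site downstream of the start codon")
    return site_pos - atg_pos

# Hypothetical toy sequence, not taken from Fig. 1A.
toy = "TTGACCATGTCTGTCAAGGGCCCTTAA"
atg = toy.find("ATG")
print(nearest_downstream_site(toy, atg), bases_to_trim(toy, atg))
```

Knowing this count in advance is what allows the exonuclease digestion to be stopped at the intended point.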

Fig. 2 of the accompanying drawings illustrates
diagrammatically the steps in a process of preparing plasmid
pDG6 which can be used to create Aspergillus transformants
according to the present invention. On Fig. 2, 18 is a
recombinant plasmid containing the endoglucanase (cellulase)
coding region 30 from the bacterium Cellulomonas fimi, namely a
BamHI endoglucanase fragment from C. fimi in known vector
M13mp8. It contains relevant restriction sites for EcoRI, Hind
III and BamHI as shown as well as others not shown and not of
consequence in the present process. Item 20 is a recombinant
plasmid designated p5, constructed from known E. coli plasmid
pBR322 and containing an EcoRI fragment of A. nidulans



WO 86/06097 




PCT/GB86/00209 



containing the alcA promoter region prepared as described above, 
along with a small portion of the alcA coding region, including 
the start codon ATG. It has restriction sites as illustrated, 
as well as other restriction sites not used in the present 
process and so not illustrated. Plasmid p5 contains a DNA 
sequence 22, from site EcoRI (3') to site Hind III (5'), which
is in fact a part of the sequence illustrated on Fig. 1A, upper 
row, from position 15 (the sequence GAATTC thereat constituting 
an EcoRI restriction site) to position 17. Sequence 22 in 
plasmid 20 is approximately 2 kb in length. 

The plasmids 18 and 20 are next cut with restriction
enzymes EcoRI and Hind III, so as to excise the alcA promoter
region and the endoglucanase coding region 30 which are ligated
to Hind III-cut plasmid pUC12, to form a novel construct pDG5A
containing these sequences on pUC12, as shown in Fig. 2.
Plasmid pUC12 is a known, commercially available E. coli
plasmid, which replicates efficiently in E. coli, so that
abundant copies of pDG5A can be made if desired. Novel
construct pDG5A is isolated from the other products of the
construct preparation. Next, construct pDG5A is provided with a
selectable marker so that subsequently obtained transformants of
Aspergillus into which the construct has successfully entered
can be selected and isolated. In the case of Arg B-
Aspergillus hosts, one can suitably use an Arg B+ gene from
A. nidulans for this purpose. The Arg B gene codes for the
enzyme ornithine transcarbamylase, and strains containing this
gene are readily selectable and isolatable from Arg B- strains
by standard plating out and cultivation techniques. Arg B-
strains will not grow on a medium lacking arginine.

To incorporate a selectable marker, in this embodiment 
of the invention as illustrated in Fig. 2, construct pDG5A may 
be ligated with the Xba I fragment 32 of plasmid pDG3 ATCC 53006 
(see U.S. patent application serial number 06/678,578 Buxton et 
al., filed December 5, 1984) which contains the Arg B+ gene
from A. nidulans using Xba I, to form novel construct pDG6,
which contains the endoglucanase coding region, the alcA
promoter sequence and the Arg B gene. Plasmid pDG6 is then used
in transformation, to prepare novel Aspergillus mutant strains 
containing an endoglucanase coding region under the control of 
alcA promoter, as described in more detail in Example 1. 

Fig. 3 shows in linear form the diagrammatic sequence 
of the functional portion of construct pDG6, from the Hind III
site 24 to the Hind III site 26. It contains the alcA promoter
region 22, the ATG codon 12 and a small, residual portion of the
alcA coding region downstream of the ATG codon as shown in Fig.
1A, followed by the cellulase coding region 30 derived from
plasmid 18. 

Plasmid pDG6 is but one example of a vector which
contains a filamentous fungal promoter linked to a protein
coding region foreign to the fungus. In another vector
exemplified herein with reference to Figure 16 and referred to
herein as pALCA1AMY, the filamentous fungal promoter of the alcA
gene is coupled with the naturally occurring sequence coding for
the α-amylase enzyme, a product which is foreign to the
transformed fungal host. The protein products of vectors pDG6
and pALCA1AMY are expressed by the respective transformed hosts
in both cases. Further, because the α-amylase coding region
naturally contains a signal peptide coding region, this product
can be secreted by the transformed host, using the secretory
machinery of the host, despite its foreign relationship with the
host.

Identifying and isolating the promoter regions of 
filamentous fungi thus allows one to manipulate the host by 
transformation with vectors containing these promoter regions 
coupled with a desired coding region. 

If the coding region of the vector requires a signal
peptide coding region or the existing signal sequence is to be
replaced by a different, preferably more efficient signal
peptide coding region, such signal peptide coding regions may be
integrated between the promoter and that segment coding for the
secreted protein. Plasmids pGL2 (Figure 4) and pALCA1S (Figure
9) represent intermediate cloning vectors particularly suited
for this purpose. Each can function as a cassette, providing a
promoter, a signal sequence and a restriction site downstream of
the signal sequence which permits insertion of a protein coding
region in proper translational reading frame with the signal
sequence.

Plasmid pGL2 shown in Figure 4 is created from pGL1, 
which contains the promoter 40, the signal sequence 42 and an 
initial portion 46 of the glucoamylase gene, all of which were 
derived in one segment from A. niger DNA according to methods 
exemplified herein. In this segment, a BssH II restriction site 
is available toward the end of, but nevertheless within, the 
glucoamylase signal sequence 42, the nucleotide sequence of 
which is reproduced below in Chart 1. 

Chart 1 

5' ATG TCG TTC CGA TCT CTA CTC GCC CTG AGC GGC CTC GTC TGC 
   met ser phe arg ser leu leu ala leu ser gly leu val cys 

                                    BssH II 

   ACA GGG TTG GCA AAT GTG ATT TCC AAG CGC 3' 
   thr gly leu ala asn val ile ser lys arg 
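
As a consistency check on Chart 1, the following Python sketch translates the printed nucleotide sequence with the standard genetic code so that the result can be compared with the residues shown beneath it. The names `CODON_TABLE`, `SIGNAL_DNA` and `translate` are illustrative only and form no part of the disclosure.

```python
# Standard genetic code, laid out in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Signal sequence as printed in Chart 1, read 5' to 3'.
SIGNAL_DNA = "".join("""
ATG TCG TTC CGA TCT CTA CTC GCC CTG AGC GGC CTC GTC TGC
ACA GGG TTG GCA AAT GTG ATT TCC AAG CGC
""".split())


def translate(dna):
    """Translate a DNA string codon by codon; length must be a multiple of 3."""
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


SIGNAL_PEPTIDE = translate(SIGNAL_DNA)
print(SIGNAL_PEPTIDE)
```

The one-letter output corresponds to the 24 residues of the chart, from met through lys arg.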

In order to provide a segment downstream of the signal 
sequence i.e. a linker 44, capable of receiving a protein coding 
region in reading frame with the signal sequence 42, advantage 
is taken of the presence of the BssH II site within the signal 
sequence and the Sst I site downstream thereof. In this 
specific embodiment, segment 46 is excised from pGL1 and 
replaced with a selected one of three linkers shown in Figure 5 
and denoted A, B or C. Each linker is able to ligate with the 
BssH II end and the Sst I end. The linkers are also engineered 
so as to restore the terminal codons of the signal sequence lost 
upon excision of segment 46 with BssH II. Further, each linker 
defines unique EcoRV and Bgl II/Xho II sites within its 
nucleotide sequence so as to permit insertion of the desired 
coding region into the vector pGL2. 

Selection of the appropriate linker is made with 
knowledge of at least the first few codons of the protein coding 
region to be inserted into the linker. In order for the protein 
coding region to be translated sensibly, the start of the 
protein coding region must be either directly coupled with or be 
a specific number of nucleotides i.e. in triplets, from the 
start of the signal sequence. Accordingly, if the protein 
coding region to be inserted possesses one or two unessential 
nucleotides (or a non-triplet factor thereof) at its 5' region 
as may result from routine excision, one of the three linkers 
shown in Figure 5 can compensate for the presence of the extra, 
superfluous nucleotides and locate the start of the protein 
coding region in translational reading frame with the signal 
sequence. 

The amino acid residues encoded by the linkers A, B and 
C appear under their nucleotide sequences as shown in Figure 5, 
from which the effect of adding an additional nucleotide to the 
linker sequence on the reading frame of the linker and 
ultimately on the inserted protein encoding region may be 
noted. By designing the linkers such that the restriction site 
is always downstream of the reading frame modification i.e. one, 
two or three adenine residues in linkers A, B or C respectively, 
the reading frame of the coding region inserted into the 
restriction site can be maintained by appropriate linker 
selection. 
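
The frame arithmetic behind linker selection can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the adenine counts for linkers A, B and C are taken from the text, whereas `fixed_nt` is a hypothetical parameter standing in for any other nucleotides between the last signal codon and the insertion point, which depend on the sequences of Figure 5 and are not reproduced here.

```python
# Adenine residues that each linker places upstream of the restriction
# site, as stated in the text for linkers A, B and C.
LINKER_PADDING = {"A": 1, "B": 2, "C": 3}


def in_frame_linkers(extra_nt, fixed_nt=0):
    """Return the linkers that keep the inserted coding region in frame.

    extra_nt: superfluous nucleotides at the 5' end of the insert.
    fixed_nt: hypothetical count of other nucleotides lying between the
              signal sequence and the insertion point (an assumption).
    """
    return [
        name
        for name, padding in LINKER_PADDING.items()
        if (fixed_nt + padding + extra_nt) % 3 == 0
    ]


for extra in range(3):
    print(extra, in_frame_linkers(extra))
```

Whatever the value of `fixed_nt`, exactly one of the three linkers satisfies the multiple-of-three condition for a given insert, which is why three linkers suffice.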

Exemplary of plasmids which employ plasmid pGL2 and 
specific linker segments A, B or C are pGL2BIFN, which employs 
the B linker and results in interferon secretion when used 
in transformation of a filamentous fungus, e.g. Aspergillus sp., and 
pGL2CENDO, which employs the C linker and results in 
endoglucanase secretion when such filamentous fungi are 
transformed therewith. 

While plasmid pGL2 utilizes a naturally occurring 
signal sequence, it is within the scope of the invention also to 
utilize vectors containing synthetic signal sequences. An 
example of one such vector is pALCA1S which, like plasmid pGL2, 
represents an intermediate vector within which a protein coding 
region may be inserted to form a vector capable of transforming 
filamentous fungi. Unlike pGL2, however, pALCA1S utilizes the 
alcA promoter and a synthetic signal sequence coupled 
to that promoter. pALCA1S is illustrated in Figure 9, which 
shows a scheme for preparing it and to which further reference 
is made in the examples. Exemplary of plasmids created from 
pALCA1S are pALCA1SIFN, which results in secretion of 
interferon α-2 from a filamentous fungus transformed therewith, 
and pALCA1SENDO, which results in secretion of endoglucanase from 
a filamentous fungus host. In both instances secretion is 
obtained despite the foreign nature of the secreted protein with 
respect to the host. 

The invention is further described and illustrated by 
the following specific, non-limiting examples. 

Examples 1 and 2 which follow each exemplify 
successful transformation of a filamentous fungal host using 
vectors having a filamentous fungus-derived promoter coupled 
with naturally occurring but non-fungal coding regions. 

Example 1 - Transformation of A. nidulans using pDG6 ATCC 53169 

The vector construct pDG6 shown in Figure 2 was first 
prepared following the process scheme illustrated in Figure 2, 
using standard routine ligation and restriction techniques. 
Then the construct pDG6 was introduced into Arg B mutant 
cells of Aspergillus nidulans as follows: 
500 ml of complete media (Cove 1966) + 0.02% arginine 
+ 10⁻⁵% biotin in a 2 l conical flask was inoculated with 
10⁵ conidia/ml of an A. nidulans Arg B- strain and incubated 
at 30°C, shaking at 250 rpm, for 20 hours. The mycelia were 
harvested through Whatman No. 54 filter paper, washed with 
sterile deionized water and sucked dry. The mycelia were added 
to 50 ml of filter-sterilized 1.2 M MgSO4, 10 mM potassium 
phosphate pH 5.8 in a 250 ml flask, to which was added 20 mg of 
Novozym 234 (Novo Enzyme Industries), 0.1 ml (=15000 units) of 
β-glucuronidase (Sigma) and 3 mg of bovine serum albumin for 
each gram of mycelia. Digestion was allowed to proceed at 37°C 
with gentle shaking for 50-70 minutes, checking periodically for 
spheroplast production by light microscope. 50 ml of sterile 
deionized water was added and the spheroplasts were separated 
from undigested fragments by filtering through 30 µm nylon mesh 
and harvested by centrifuging at 2500 g for 5 minutes in a 
swing-out rotor in 50 ml conical-bottom tubes, at room temperature. 
The spheroplasts were washed, by resuspending and centrifuging, 
twice in 10 ml of 0.6 M KCl. The number of spheroplasts was 
determined using a hemocytometer and they were resuspended at a 
final concentration of 10⁸/ml in 1.2 M sorbitol, 10 mM 
Tris/HCl, 10 mM CaCl2 pH 7.5. Aliquots of 0.4 ml were placed 
in plastic tubes to which pDG6 DNA (total vol. 40 µl in 10 mM 
Tris/HCl, 1 mM EDTA pH 8) was added and incubated at room 
temperature for 25 minutes. 0.4 ml, 0.4 ml then 1.6 ml aliquots 
of 60% PEG4000, 10 mM Tris/HCl, 10 mM CaCl2 pH 7.5 were added 
to each tube sequentially, with gentle but thorough mixing 
between each addition, followed by a further incubation at room 
temperature for 20 minutes. The transformed spheroplasts were 
then added to appropriately supplemented minimal media 1% agar 
overlays, plus or minus 0.6 M KCl, at 45°C and poured immediately 
onto the identical (but cold) media in plates. After 3-5 days 
at 37°C the number of colonies growing was counted (F. Buxton et 
al., Gene 37, 207-214 (1985)). The method of Yelton et al. 
[Proc. Nat'l Acad. Sci. U.S.A. 81, 1370-1374 (1980)] was also 
used. 
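
The volumes in the PEG step lend themselves to a short worked calculation. The Python sketch below estimates the PEG concentration after each sequential addition and the number of spheroplasts per aliquot; it assumes that volumes are simply additive and that the DNA contributes the full 40 µl, and the function name `peg_after_additions` is illustrative only.

```python
def peg_after_additions(start_volume_ml, additions_ml, stock_percent=60.0):
    """Return the PEG concentration (%) after each sequential addition.

    Assumes volumes are additive and the stock is 60% PEG4000.
    """
    total_ml = start_volume_ml
    peg_equivalent_ml = 0.0  # volume-weighted amount of PEG added so far
    concentrations = []
    for added_ml in additions_ml:
        total_ml += added_ml
        peg_equivalent_ml += added_ml * stock_percent / 100.0
        concentrations.append(100.0 * peg_equivalent_ml / total_ml)
    return concentrations


# 0.4 ml aliquot at 10^8 spheroplasts per ml.
spheroplasts_per_aliquot = 0.4 * 1e8

# Aliquot plus 40 microlitres of DNA solution before any PEG is added.
start_volume_ml = 0.4 + 0.040
peg_steps = peg_after_additions(start_volume_ml, [0.4, 0.4, 1.6])

for step, percent in enumerate(peg_steps, start=1):
    print(f"after addition {step}: {percent:.1f}% PEG")
```

Under these assumptions the three additions raise the PEG concentration in stages to roughly 51% in a final volume of 2.84 ml.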
The colonies were divided into two groups. Threonine 
(11.9 g/litre) and fructose (1 g/litre) were added to the 
incubation medium for one group to induce the cellulase gene 
incorporated therein. No inducer was added to the other group, 
which was repressed by growth on minimal media with glucose as 
sole carbon source. Both groups were assayed for general 
protein production by BioRad assay following cultivation, 
filtering to separate the mycelia, freeze drying, grinding and 
protein extraction with 20 mM Tris/HCl at pH 7. 

To test for production of cellulase, plates of agar 
medium containing cellulose (9 g/litre carboxymethylcellulose) 
were prepared. Small pieces of glass fibre filter material were 
placed on the plates, isolated from one another, and 75 µg of 
total protein from one of the transformants was added to each of 
the filters. The plates were incubated overnight at 37°C. The 
filters were then removed, and the plates stained with Congo red 
to determine the locations where cellulase had been present in 
the total protein on the filters, as evidenced by the breakdown 
of cellulose in the agar medium below. The plates were 
de-stained, by washing with 5M NaCl in water, to detect the 
differences visibly. 

Of four transformants induced with threonine and 
fructose, three clearly showed the presence of cellulase in the 
total protein product. The non-induced, glucose repressed 
transformants did not show evidence of cellulase production. 

Three control transformants were also prepared from the 
same vector system and strains, but omitting the promoter 
sequence. None of them produced cellulase, with or without 
inducers. The presence of the C. fimi endoglucanase coding region 
was verified by the fact that medium from threonine-induced 
transformed strains showed reactivity with a monoclonal antibody 
raised against C. fimi endoglucanase. This monoclonal antibody 
showed no cross-reactivity with endogenous A. nidulans proteins 
in control strains. 
Example 2 - Transformation of A. nidulans using pALCA1AMY 
ATCC 53380 

The vector construct pALCA1AMY was prepared as 
indicated in Figure 16, using standard routine ligation and 
restriction techniques. In particular, and with reference to 
Figure 16, vector pALCA1, containing a Hind III-EcoRI segment in 
which the A. nidulans alcohol dehydrogenase I promoter 22 is 
located (as described previously), was cut at its EcoRI site in 
order to insert the coding region of the wheat α-amylase gene 
72 contained within an EcoRI-EcoRI fragment defined on plasmid 
p501 (see S.J. Rothstein et al., Nature, 308, 662-665 (1984)). 
As wheat α-amylase is a naturally secreted protein, its coding 
region 72 contains a signal peptide coding region 76 and a 
segment 78 which encodes mature, secreted α-amylase. Ligation 
of coding region 72 contained in the EcoRI-EcoRI segment of p501 
within the EcoRI-cut site of pALCA1 provides plasmid pALCA1AMY, 
in which the alcA promoter 22 is operatively associated with 
the α-amylase coding region. The correct orientation of the 
p501-derived α-amylase coding region within pALCA1AMY is 
confirmed by sequencing across the ligation site according to 
standard procedures. The nucleotide sequence of the 
promoter/coding region junction is shown in Figure 17. 

A. nidulans may be transformed by the procedure 
described in Example 1, samples of extracellular medium being 
taken and applied to glass fibre filter papers placed on 1% 
soluble starch agar. The filters are then removed after 8 hours 
at 37°C and inverted onto beakers containing solid iodine (in a 
50°C water bath). Clear patches indicate starch degradation 
while the remaining starch turns a deep purple, thereby 
confirming the presence of secreted α-amylase. 

In Examples 3-12 which follow, vectors are provided in 
which a secretion signal peptide coding region is introduced in 
the vector in order to obtain secretion of a foreign protein 
from a filamentous fungus transformed by the entire vector. 
Example 3 - Production of Plasmid pGL2, an intermediate vector 

A) Source of promoter and signal peptide sequence 

The glucoamylase gene of A. niger was isolated by 
probing a gene bank derived from DNA available in a strain of 
this microorganism on deposit with ATCC under catalogue number 
22343. The probing was conducted using oligonucleotide probes 
prepared with Biosearch oligonucleotide synthesis equipment and 
with knowledge of the published amino acid sequence of the 
glucoamylase protein. The amino acid sequence data was "reverse 
translated" to nucleotide sequence data and the probes 
synthesized. The particular gene bank probed was a Sau 3A 
partial digest of the A. niger DNA described above cloned into 
the Bam HI site of the commercially available plasmid pUC12 
which is both viable in and replicable in E. coli. 
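
Because most amino acids are encoded by more than one codon, "reverse translation" yields a family of possible nucleotide sequences, and probes are usually drawn from stretches where that family is small. The Python sketch below counts the sequences implied by a peptide under the standard genetic code. The peptide `LSRMWKDHL` is a made-up example, not glucoamylase sequence, and the helper names are illustrative assumptions.

```python
from collections import Counter
from math import prod

# One letter per codon of the standard genetic code (T, C, A, G ordering);
# counting each letter gives the number of codons for that residue.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_PER_RESIDUE = Counter(AMINO_ACIDS.replace("*", ""))


def degeneracy(peptide):
    """Number of distinct nucleotide sequences that could encode the peptide."""
    for residue in peptide:
        if residue not in CODONS_PER_RESIDUE:
            raise ValueError(f"unknown residue: {residue!r}")
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide)


def least_degenerate_window(peptide, length):
    """Return (window, degeneracy) for the least degenerate stretch."""
    windows = [peptide[i:i + length] for i in range(len(peptide) - length + 1)]
    best = min(windows, key=degeneracy)
    return best, degeneracy(best)


# Hypothetical peptide used purely for illustration.
print(least_degenerate_window("LSRMWKDHL", 5))
```

Stretches rich in methionine and tryptophan, each of which has a single codon, give the smallest probe families.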

A Hind III-Bgl II piece of DNA containing the 
glucoamylase gene was subcloned into pUC12. Subsequently, the 
location of the desired promoter region, signal peptide coding 
region and protein coding region of the glucoamylase was 
identified within pUC12 containing the sub-cloned fragment. The 
EcoRI/EcoRI fragment (see Figure 4) was shown to contain a long, 
open translation reading frame when it was sequenced and the 
sequence data was analyzed using the University of Wisconsin 
sequence analysis programmes. 
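
The idea of searching a sequence for a long open reading frame can be illustrated with a short Python sketch. It is not the University of Wisconsin software referred to above: it simply scans the three forward frames of one strand for the longest stretch free of stop codons, and the test sequence is a toy example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_open_frame(seq):
    """Return (length_nt, frame, start) of the longest stop-free stretch.

    Only the three forward frames of the given strand are examined.
    """
    seq = seq.upper()
    best = (0, 0, 0)
    for frame in range(3):
        run_start = frame
        # Position just past the last complete codon in this frame.
        frame_end = frame + 3 * ((len(seq) - frame) // 3)
        for i in range(frame, frame_end, 3):
            if seq[i:i + 3] in STOP_CODONS:
                if i - run_start > best[0]:
                    best = (i - run_start, frame, run_start)
                run_start = i + 3
        if frame_end - run_start > best[0]:
            best = (frame_end - run_start, frame, run_start)
    return best


# Toy sequence: frame 0 is interrupted by TAG, frame 2 is open throughout.
print(longest_open_frame("ATGAAATAGCCC"))
```

A full analysis would also examine the complementary strand and would normally require the open frame to begin with an ATG codon.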

Results of analysis of the nucleotide sequence of part 
of the region of the glucoamylase gene between the 5' Eco RI 
site and BssH II 3' site within the Hind III - Bgl II fragment 
are shown in Figure 6. This region contains the glucoamylase 
promoter and the signal peptide coding region. 

Within this fragment i.e. at nucleotides 97-102 is a 
"TATA box" 48 which provides a site required by many eukaryotic 
promoter regions for accurate initiation of transcription 
(probably an RNA polymerase II binding site). Accordingly, the 
presence of at least a portion of the promoter region is 
confirmed. Further, it is predictable from analogy with other 
known promoter regions that all the functional essentials are 
likely to be contained within a sequence of about 1,000 bases in 
length and most likely within the first 200 bases upstream of 
the start codon for the coding region i.e. nucleotides 206-208 
or "ATG" 49, the codon for methionine. Thus, the promoter and 
transcript leader terminate at nucleotide 205. The identity of 
the beginning of the promoter region is less crucial although 
the promoter region must contain the RNA polymerase II binding 
site and all other features required for its function. Thus, 
whereas the Eco RI-Eco RI sequence is believed to represent the 
entire promoter region of the glucoamylase gene, the fragment 
used in plasmid pGL2 contains this fragment in the much larger 
Hind III - Bam HI/Bgl II segment to ensure that the entire 
promoter region is properly included in the resultant plasmid. 

On the basis that the amino acid sequence of mature 
glucoamylase is known (see Svensson et al, "Characterization of 
two forms of glucoamylase from Aspergillus niger", Carlsberg 
Res. Commun., 47, 55-69 (1982)), a nucleotide sequence of the 
signal peptide can be determined accurately. The signal peptide 
coding region of genes encoding secreted proteins is known to 
initiate with the methionine residue encoded by the ATG codon 
49. Determination of a sufficient initial portion of the 
nucleotide sequence beyond i.e. 3' of the ATG codon provides 
information from which the amino acid sequence of that portion 
may be determined. By comparison of this amino acid sequence 
with the published amino acid sequence, the signal peptide can 
be identified as that portion of the glucoamylase gene which has 
no counterpart in the published sequence with which it was 
compared. The glucoamylase signal peptide coding region defined 
herein was previously confirmed using this method. 

By the above methods, the Hind III - Bam HI/Bgl II 
fragment resulting from Sau 3A partial digestion and 
incorporated into pUC12 was confirmed to contain the following 
features of the glucoamylase gene: an initial, perhaps 
non-relevant section, the promoter region, the signal peptide 
coding region and the remaining portion of the coding region. 
This fragment, inserted into the pUC12 plasmid by scission with 
Hind III and Bam HI/Bgl II and ligation, appears schematically in 
Figure 4 as plasmid pGL1. This plasmid contains all of the 
features necessary for replication and the like in order to 
remain selectable and replicable in E. coli. 

B) Construction of Plasmid pGL2 

Using pGL1 as a precursor, plasmid vector pGL2 can be 
formed as shown in Figure 4. The restriction site BssH II near 
the 3' end of the signal sequence 42 is utilized together with 
the unique downstream Sst I site in order to insert a synthetic 
linker sequence A, B or C defined in Figure 5 herein. Thus, 
pGL1 is cleaved with both BssH II and Sst I, thereby removing the 
initial portion of the glucoamylase coding region 46 contained 
therein. Thereafter a selected one of the synthetic linker 
sequences A through C, having been designed so as to be flanked 
by BssH II/Sst I compatible ends, is inserted and ligated, 
thereby generating plasmid pGL2. Depending on which of the 
three linker sequences is used i.e. A, B or C, the resultant 
plasmid will hereinafter be identified as pGL2A, pGL2B or pGL2C, 
respectively. 

The synthetic linker sequences identified herein are 
each equipped with unique Eco RV and Bgl II restriction sites, 
as shown in Figure 5, into which a desired protein coding region 
may be inserted. Once inserted, the resultant plasmid may be 
used to transform a host e.g. A. niger, A. nidulans and the 
like. The presence of the promoter region and the signal 
peptide coding region, both of which are recognized by the host, 
provides a means whereby expression of the protein coding region 
and secretion of the protein so expressed is made possible. 
Example 4 - Use of Plasmid pGL2 in creating pGL2BIFN 

An example of the utility of the plasmid pGL2 is 
described below with reference to Figure 7, which shows 
schematically the construction of plasmid pGL2BIFN from pGL2B. 

The plasmid pGL2B is prepared as described in general 
previously for pGL2, save that synthetic linker sequence "B" 
shown in Figure 5 is inserted specifically. The reference 
numeral 44 has accordingly been modified in Figure 7 to read 
"44B". In order to make available an opening in the vector 
pGL2B, the plasmid is cut with Eco RV at the site internal to 
linker 44B. The scission results in blunt ends which may be 
ligated with a fragment flanked by blunt ends using ligases 
known to be useful for this specific purpose. 

In the embodiment depicted in Figure 7, a fragment 60 
containing the coding region of human interferon α-2 is 
inserted to create pGL2BIFN. Specifically, a Dde I - Bam HI 
fragment 60 containing the coding region for human 
interferon α-2 was excised from plasmid pN5H8 (not shown) on 
the basis of the known sequence and restriction map of this gene. 

The plasmid pN5H8 combines known plasmid pAT153 with 
the interferon gene at a Bam HI site. The interferon gene 
therein is described by Slocombe et al., "High level expression 
of an interferon α-2 gene cloned in phage M13mp7 and 
subsequent purification with a monoclonal antibody", Proceedings 
of the National Academy of Sciences, U.S.A., Vol. 79, pp. 5455-5459 
(1982). 

In order to anneal the sticky ends of the interferon 
fragment into the cut Eco RV site of pGL2B, the sticky Dde I and 
Bam HI ends are filled using reverse transcriptase and ligated 
with an appropriate ligase according to techniques standard in 
the art. 
The advantage of selecting linker sequence B for 
insertion into pGL2 is manifest from Figure 8, which shows the 
reading frame of the interferon α-2 coding region and its 
relationship with the recreated signal peptide sequence, in 
terms of nucleotide sequence and amino acid sequence, where 
appropriate. 

Figure 8 shows a portion of the promoter region 40 5' 
of the signal sequence joined with a portion of the glucoamylase 
signal peptide sequence 42 beginning with the methionine codon 
ATG at 49 and ending with the lysine codon AAG at 50. In fact, 
although the signal peptide coding region extends one residue 
further i.e. to the CGC codon for arginine at 52, this latter 
residue is comprised by the synthetic linker sequence 44B 
engineered so as to compensate for the loss of the arginine 
residue during scission and ligation to insert the linker 
sequence. In this way, the genetic sequence of the signal 
remains undisturbed. 

In a similar manner, the linker sequence provides for 
insertion of the interferon α-2 coding region without altering 
the reading frame thereof. With reference to Figures 7 and 8 
cleavage of linker sequence 44B by Eco RV results in linker 
fragments 44B' and 44B" having blunt ends. Excision of the 
interferon α-2 coding region at the Dde I site results, after 
filling in of the sticky ends created by the enzyme, in the 
desired nucleotide sequence without harming the sequence of that 
coding region. Ligation within the Eco RV-cleaved linker 
sequence of the interferon sequence filled at the Dde I site 
maintains the natural reading frame of the interferon coding 
region as evidenced by the triplet codon state between the 
linker portion 44B' and the interferon coding region 60. Had 
the linker A shown in Figure 5 been chosen, which bears one less 
nucleotide than the linker B, the entire reading frame would 
have been shifted by one nucleotide resulting in a nonsense 
sequence. By selection of synthetic linker B, codons are made 
available between the signal peptide sequence and the interferon 
coding region which do not alter the reading frame of the coding 
region, when the blunt-ended IF α-2 fragment is oriented 
correctly. The correct orientation is selected by sequencing 
clones with inserts across the ligation junction. 

Example 5 - Expression and Secretion from A. nidulans 
Transformed with pGL2BIFN ATCC 53371 

The plasmid pGL2BIFN was cotransformed into an Arg B- 
strain of A. nidulans with a separate plasmid containing an 
Arg B+ selectable marker, as described more fully in U.S. 
patent application serial No. 678,578 filed December 5, 1984. 
Arg B+ transformants were selected, of which 18 of 20 contained 
1 - 100 copies of the human interferon α-2 coding region (as 
detected by Southern blot analysis). 

Several transformants were grown on starch medium to 
induce the glucoamylase promoter and the extracellular medium 
was assayed for human IF α-2 using the CellTech IF α-2 assay 
kit. 

All transformants exhibited some level of synthesis and 
secretion of assayable protein. Two controls, the host strain 
(not transformed) and one arg B+ transformant with no 
detectable human IF α-2 DNA, showed no detectable synthesis of 
IF α-2 protein. In a separate experiment, transformation of 
A. niger, rather than A. nidulans, with pGL2BIFN using, mutatis 
mutandis, the same procedure as described above, demonstrated 
the ability of A. niger to secrete IF α-2. 

Thus, although the promoter and signal regions of 
pGL2BIFN are derived from A. niger, they are shown to be 
operative in both A. nidulans and A. niger. 

In the present invention, use may be made of promoter 
regions other than the glucoamylase promoter region. Suitable 
for use are the promoter regions of the alcohol dehydrogenase I 
gene and the aldehyde dehydrogenase gene, illustrated in Figures 
1A and 1B. 

Example 6 - Construction of Plasmid pALCA1S, ATCC 53368 
an intermediate vector 

For use with the present example, the alcA promoter was 
employed as comprised within a 10.3 kb plasmid pDG6 deposited 
with ATCC within host E. coli JM83 under accession number 
53169. A plasmid map of pDG6 is shown in Figure 2 and, for ease 
of reference, in Figure 9 to which reference is now made, to 
illustrate another embodiment using the alcA promoter. 

pDG6 comprises, in its Hind III-EcoRI (first 
occurrence) segment, the promoter region 22 of the alcA gene as 
well as a small 5' portion of the alcA coding region 3' of the 
start codon, ligated to the endoglucanase coding region 30. 
pDG6 further comprises a multiple cloning site 62 downstream of 
the C. fimi endoglucanase coding region 30. 

To retrieve the alcA promoter region 22, pDG6 was cut 
with Pst I and Xho I, removing the bulk of the endoglucanase 
coding region 30. In a second step, the linearized plasmid 64 
was resected in one direction in a controlled manner with 
exonuclease III (which will resect from Xho I-cut but not Pst I-cut 
DNA ends) followed by tailoring with nuclease S1. The resection 
was timed so that the enzyme removed nucleotides to a position 
50 bases 5' of the alcA ATG codon, leaving the TATA box and 
messenger RNA start site intact. 

Following resection, the vector 66 was religated 
(recircularized), creating vector 68 bearing Sal I-Xba I 
restriction sites immediately downstream of the promoter region 
22. Cleavage of vector 68 with Sal I/Xba I permits introduction 
of a signal peptide coding region at an appropriate location 
within the vector. 
The particular signal peptide coding region employed in 
the present example was synthesized to reproduce a 
characteristic signal peptide coding region identified according 
to standard procedures as described by G. von Heijne in Eur. J. 
Biochem. 133, 17-21 (1983). The synthetic signal was engineered so 
as to provide a 5' flanking sequence complementary to a Sal I 
cleavage site and a 3' flanking sequence enabling ligation with 
the Xba I restriction sequence. 

The sequence of the synthetic secretion signal 68 is 
reproduced below: 
   Sal I 
   1 
5' TCGACATGTACCGGTTCCTCGCCGTCATCTCGGCCTTCCTCGCCACTGCCTTCGCCAAG 
3'     GTACATGGCCAAGGAGCGGCAGTAGAGCCGGAAGGAGCGGTGACGGAAGCGGTTC 
        MetTyrArgPheLeuAlaValIleSerAlaPheLeuAlaThrAlaPheAlaLys 
                                                    ^ 

   Xba I 
   60  64 
   T     3' 
   AGATC 5' 
   SerArg 

The secretion signal per se begins with Met and ends 
with the fourth occurrence of Ala, as indicated by the arrow. 
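
The two strands of the synthetic signal can be checked against each other with a few lines of Python. The sketch below takes only the first row of the sequence as printed (positions 1 to 59 of the upper strand and the lower strand beneath it), confirms that the lower strand is the base-by-base complement of the upper strand, and reports the unpaired 5' extension at the Sal I end. The variable names are illustrative, and the short Xba I end is omitted.

```python
# Upper strand (5' to 3') and lower strand (3' to 5') as printed above.
TOP_5_TO_3 = "TCGACATGTACCGGTTCCTCGCCGTCATCTCGGCCTTCCTCGCCACTGCCTTCGCCAAG"
BOTTOM_3_TO_5 = "GTACATGGCCAAGGAGCGGCAGTAGAGCCGGAAGGAGCGGTGACGGAAGCGGTTC"

# Base-pairing rules: A with T, C with G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Nucleotides of the upper strand left unpaired at the Sal I end.
overhang_length = len(TOP_5_TO_3) - len(BOTTOM_3_TO_5)
sal_overhang = TOP_5_TO_3[:overhang_length]

# The remainder of the upper strand should pair with the lower strand.
paired_region = TOP_5_TO_3[overhang_length:]
strands_agree = paired_region.translate(COMPLEMENT) == BOTTOM_3_TO_5

print(sal_overhang, strands_agree)
```

The four-nucleotide extension TCGA is the cohesive end expected for ligation into a Sal I-cut vector.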

Once generated, the synthetic sequence 68 acting as 
signal is cloned into the Sal I-Xba I site of vector 70 
resulting in plasmid pALCA1S which contains alcA promoter region 
22, and synthetic peptide signal coding region 68. That the 
signal peptide coding region is inserted upstream of the 
multiple cloning site 62 is significant in that the site 62 
allows for cloning of a variety of protein coding segments 
within this plasmid. 

Accordingly, pALCA1S constitutes a valuable embodiment 
of the present invention. 
Example 7 - Construction of Plasmid pALCA1SIFN 

As an example of the utility of pALCA1S, reference is 
made to Figure 10 showing creation of pALCA1SIFN. This plasmid 
comprises the promoter region 22 of the alcA gene and the 
synthetic signal peptide coding region 68, both of which are 
derived from pALCA1S (Figure 9). In addition, it contains the 
coding region 60 coding for human interferon α-2 derived from 
pGL2BIFN. 

To obtain the protein encoding segment, pGL2BIFN is 
cleaved with Eco RI and partially cleaved with Bgl II (because 
of the presence of internal Bgl II sites). Insertion of the 
protein coding region is accomplished by cleaving pALCA1S with 
Bam HI and Eco RI, both of which are available in the multiple 
cloning site 62, and ligating this coding region therein, thereby 
creating pALCA1SIFN. 

The nucleotide sequence of the resultant plasmid, from 
a site 1170 nucleotides downstream of Hind III to Eco RI is 
shown in Figure 11, indicating the relevant sites of restriction 
endonuclease digestion. It will be noted from sheet 3 of Figure 
11 that the IF α-2 coding region 60 is in proper reading frame 
with the synthetic signal peptide coding region 68. 

Example 8 - Expression and Secretion from A. nidulans 
Transformed with Plasmid pALCA1SIFN 

The plasmid pALCA1SIFN prepared as described above was 
co-transformed into A. nidulans with a plasmid providing an arg B 
selectable marker, the arg B+ transformants selected and checked 
for the presence of the human interferon α-2 coding region, then 
grown on a threonine-containing medium to induce the alcA promoter, 
all as described in example 3 above. The extracellular medium 
was assayed for human IF α-2 using the CellTech IF α-2 assay kit. 
Eleven of twenty transformants showed secretion of interferon, 
induced in the presence of threonine, and repressed in the 
presence of glucose. 
Example 9 - pGL2CENDO ATCC 53372 

In accordance with the procedures described in the 
previous examples, there was constructed a vector plasmid 
designated pGL2CENDO, from plasmid pGL2C ATCC 53367, analogous 
to pGL2BIFN shown in Fig. 7, but containing the endoglucanase 
coding region in place of the interferon α-2 coding region, and 
using the synthetic linker sequence "C" (Fig. 5) in place of 
linker sequence "B". A Bam HI fragment containing the C. fimi 
endoglucanase coding region 30 was inserted into the Bgl II site 
of pGL2C. A. nidulans transformants were prepared with this 
vector plasmid, and showed starch-regulated secretion of 
cellulase assayed as described in Example 1. The map of vector 
plasmid pGL2CENDO is shown in Fig. 12 of the accompanying 
drawings, in which 30 denotes the endoglucanase coding region 
(the endoglucanase coding region of Cellulomonas fimi, described 
in connection with Fig. 2 and Example 1), 42 denotes the signal 
peptide coding region of the glucoamylase gene and 40 denotes 
the promoter region of the glucoamylase gene. The nucleotide 
sequence is shown in Figure 13 and exemplifies that use of 
linker sequence C (Fig. 5) retains the reading frame of the 
signal peptide coding region 42 and the endoglucanase coding 
region 30. 

Example 10 - Construction of Plasmid pALCA1SENDO ATCC 53370 

In accordance with the procedures described in the 
previous examples, there was constructed a vector plasmid 
designated pALCA1SENDO by combining Eco RI-linearized plasmid 
pALCA1S as described in example 5 (Fig. 9) with an Eco RI 
fragment derived from plasmid pDG5B (see Fig. 2) (pDG5 with the 
orientation of the Hind III fragment reversed in pUC12) and 
containing the endoglucanase coding region 30. The map of 
pALCA1SENDO is shown in Figure 14 and the nucleotide sequence of 
its pertinent region is shown in Figure 15. In these figures, 
the promoter region derived from alcA is designated by numeral 
22, the synthetic signal peptide coding region is designated 68 
and the endoglucanase coding region is designated by reference 
numeral 30. 

Example 11 - Expression and Secretion from A. nidulans 
Transformed with pALCA1SENDO and pGL2CENDO 

A. nidulans was co-transformed with an argB+ selectable 
marker and the plasmid pALCA1SENDO or pGL2CENDO prepared as 
described above. Of the co-transformants obtained, several 
showed varying levels of secretion of cellulase (i.e. 
endoglucanase) as assayed on carboxymethylcellulose plates and 
by the monoclonal antibody test system as described in Example 1. 
Transformants of both plasmids showed secretion which was 
controlled by the linked promoter: pGL2CENDO was induced by 
starch and pALCA1SENDO by threonine. 

Example 12 - Expression and Secretion From A. niger 
Transformed with pGL2CENDO 

A. niger was cotransformed with an argB+ selectable 
marker and the plasmid pGL2CENDO. Several of the transformants 
showed varying levels of secretion of endoglucanase as assayed 
as described in example 1. This secretion was induced by the 
presence of starch in the medium. 

Example 13 - Increased Copy Number of Regulatory Genes 

In Aspergillus nidulans the alcA promoter is turned on 
in the presence of the appropriate inducer, such as ethanol, by 
the action of the gene product of alcR , the positive regulatory 
gene for alcA . 

Evidence with multiple copy transformants (containing 
multiple alcA promoters) suggests that the supply of the alcR 
gene product limits the function of the several alcA promoters 
requiring stimulation. 
Increasing the copy number of the alcR gene increases 
the expression of alcR and relieves this limitation. The 
evidence for this is as follows: 

Transformants with multiple copies of the alcA promoter 
fused to its own coding region (ADH I) in a multiple alcR 
background (which has been shown to overproduce alcR messenger 
RNA) do not grow well on ethanol. This is probably due to rapid 
accumulation of aldehydes, the product of ADH breakdown of 
ethanol. ADH activity in these strains is high. The increased 
activity of ADH due to increased copy number probably accounts 
for these observations. 

Transformants with multiple copies of the alcA promoter 
fused to interferon α-2 in a multiple alcR background produce 
significantly higher levels of secreted interferon. In these 
strains, unlike those with single copy alcR, many more of the 
alcA promoters have access to the alcR regulatory protein. 

Thus, preferred embodiments of the present invention 
provide means for introducing a coding region into a filamentous 
fungus host which, when transformed, will secrete the desired 
protein. Particularly useful intermediate plasmids for this 
purpose are pALCAlS and pGL2 (A, B or C). 

Useful transformation vectors created from these 
plasmids include pALCAlSIFN, pGL2BIFN, pALCAlSENDO and 
pGL2CENDO. Cultures of each of these and other plasmids 
mentioned herein are currently maintained in a permanently 
viable state at the laboratories of Allelix Inc., 6850 Goreway 
Drive, Mississauga, Ontario, Canada. The plasmids will be 
maintained in this condition throughout the pendency of this 
patent application and, during that time, will be made available 
to authorized persons. After issue of a patent on this 
application, these plasmids will be available from the ATCC 
depository recognized under the Budapest Treaty, without 
restriction. The accession numbers of the respective deposits 
appear in the table below: 



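Plasmid        Host            Accession #   Deposit Date

pDG6           E. coli JM83    53169         June 7, 1985
pGL2A          E. coli JM83    53365         Dec. 16, 1985
pGL2B          E. coli JM83    53366         Dec. 16, 1985
pGL2C          E. coli JM83    53367         Dec. 16, 1985
pALCAlS        E. coli JM83    53368         Dec. 16, 1985
pALCAlSENDO    E. coli JM83    53370         Dec. 16, 1985
pALCAlSIFN     E. coli JM83    53369         Dec. 16, 1985
pGL2BIFN       E. coli JM83    53371         Dec. 16, 1985
pGL2CENDO      E. coli JM83    53372         Dec. 16, 1985
pALCAlAMY      E. coli JM83    53380         Dec. 20, 1985
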
CLAIMS: 

1. A DNA construct comprising a promoter operative in a 
filamentous fungus to promote transcription of a coding region 
when operatively associated therewith. 

2. The construct according to claim 1 wherein said 
promoter is native to a filamentous fungus gene. 

3. The construct according to claim 2 wherein said 
promoter is native to an Aspergillus sp. gene. 

4. The construct according to claim 3 wherein said 
promoter is native to an Aspergillus nidulans gene. 

5. The construct according to claim 4 wherein said 
promoter is the promoter of the alcohol dehydrogenase I gene of 
Aspergillus nidulans . 

6. The construct according to claim 4 wherein said 
promoter is the promoter of the aldehyde dehydrogenase gene of 
Aspergillus nidulans . 

7. The construct according to claim 3 wherein said 
promoter is native to an Aspergillus niger gene. 

8. The construct according to claim 7 wherein said 
promoter is the promoter of the glucoamylase gene of Aspergillus 
niger . 

9. A DNA sequence operative in a filamentous fungus to 
signal secretion of a protein encoded by a protein coding region 
when said protein coding region is in proper translational 
reading frame with said DNA sequence, said DNA sequence having 
the nucleotide sequence substantially as defined below: 
5' GTCGACATGTACCGGTTCCTCGCCGTCATCTCGGCCTTCCTCGCCACTGCC 3' 
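For orientation only, the following Python sketch (not part of the claims) shows how the sequence recited in claim 9 reads in frame. It assumes the standard genetic code and takes the first ATG as the start codon. The names `translate` and `SIGNAL` are hypothetical.

```python
# Standard genetic code, built with bases in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons from position 0; trailing bases are ignored."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# Sequence of claim 9 with the ruler ticks and line break removed.
SIGNAL = "GTCGACATGTACCGGTTCCTCGCCGTCATCTCGGCCTTCCTCGCCACTGCC"

start = SIGNAL.find("ATG")           # first ATG, taken here as the start codon
peptide = translate(SIGNAL[start:])  # one-letter amino acid codes

if __name__ == "__main__":
    print("SalI site at 0:", SIGNAL.startswith("GTCGAC"))
    print("ATG offset:", start)
    print("peptide:", peptide)
```

The coding portion after the SalI site is a whole number of codons, so a protein coding region joined directly at the 3' end stays in frame only if it also begins on a codon boundary.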

10. A DNA construct comprising a promoter operative in a 
filamentous fungus in operative association with a coding region 
encoding a protein. 

11. The construct of claim 10 wherein the coding region is 
foreign to the promoter. 

12. The construct of claim 11 wherein the promoter is 
native to a filamentous fungus. 

13. The construct of claim 11 wherein the promoter is 
native to a filamentous fungus of Aspergillus sp. 

14. The construct of claim 11 wherein the promoter is 
native to Aspergillus nidulans or Aspergillus niger . 

15. The construct of claim 11 wherein the promoter is 
selected from the promoter of the alcohol dehydrogenase I gene 
of A. nidulans, the promoter of the aldehyde dehydrogenase gene 
of A. nidulans and the promoter of the glucoamylase gene of 
A. niger. 

16. The construct according to claim 10 which comprises the 
promoter of the alcohol dehydrogenase I gene of A. nidulans in 
operative association with the endoglucanase coding region of 
C. fimi. 

17. The construct according to claim 16 as contained on 
plasmid pDG6 having ATCC catalogue number 53169. 

18. The DNA expression vector according to claim 11 wherein 
the coding region naturally contains an integral signal peptide 
coding region. 
19. The construct of claim 18 wherein the promoter is 
native to a filamentous fungus. 

20. The construct of claim 18 wherein the promoter is 
native to a filamentous fungus of Aspergillus sp. 

21. The construct of claim 18 wherein the promoter is 
native to Aspergillus nidulans or Aspergillus niger. 

22. The construct of claim 18 wherein the promoter is 
selected from the promoter of the alcohol dehydrogenase I gene 
of A. nidulans, the promoter of the aldehyde dehydrogenase gene 
of A. nidulans and the promoter of the glucoamylase gene of 
A. niger. 

23. The construct according to claim 18 which comprises the 
promoter of the alcohol dehydrogenase I gene of Aspergillus 
nidulans in operative association with the wheat α-amylase 
coding region contained on plasmid p501. 

24. The construct according to claim 23 as contained on 
plasmid pALCAlAMY having ATCC catalogue number 53380. 

25. A DNA construct comprising, in a 5' to 3' direction: 

a) a promoter operative in a filamentous fungus; 

b) a signal peptide coding region in operative 
association with the promoter, said signal peptide coding region 
encoding a polypeptide which functions in a filamentous fungus 
to signal secretion of a protein fused functionally thereto; and 

c) a segment of DNA linked with said signal peptide 
coding region and defining a restriction site within which a 
protein coding region may be inserted such that said protein 
coding region is in proper translational reading frame with said 
signal peptide coding region. 
26. The construct according to claim 25 wherein said 
promoter is native to a filamentous fungus. 

27. The construct according to claim 26 wherein said signal 
peptide coding region is native to a filamentous fungus. 

28. The construct according to claim 27 wherein said signal 
peptide coding region is native to a gene coding for a protein 
secreted from Aspergillus niger or Aspergillus nidulans. 

29. The construct according to claim 26 wherein said signal 
peptide coding region is the signal peptide coding region of the 
glucoamylase gene of Aspergillus niger . 

30. The construct according to claim 29 wherein the 
promoter is the promoter of the glucoamylase gene of Aspergillus 
niger . 

31. The construct according to claim 30 as contained on a 
plasmid selected from pGL2A having ATCC catalogue number 53365, 
pGL2B having ATCC catalogue number 53366 and pGL2C having ATCC 
catalogue number 53367. 

32. The construct according to claim 26 wherein said signal 
peptide coding region has a nucleotide sequence characteristic 
of signal peptide coding regions operative in a filamentous 
fungus. 

33. The construct according to claim 32 wherein said signal 
peptide coding region and said segment of DNA have a nucleotide 
sequence substantially as follows: 



5' GTCGACATGTACCGGTTCCTCGCCGTCATCTCGGCCTTCCTCGCCACTGCC 3' 
34. The construct according to claim 33 wherein the 
promoter is the promoter of the alcohol dehydrogenase I gene of 
Aspergillus nidulans . 

35. The construct according to claim 34 as contained on 
plasmid pALCAlS having ATCC catalogue number 53368. 

36. A DNA construct comprising, in a 5' to 3' direction: 

a) a promoter operative in a filamentous fungus; 

b) a signal peptide coding region in operative 
association with said promoter, said signal peptide coding 
region encoding a signal peptide which serves to signal 
secretion from a filamentous fungus of a protein when fused 
functionally to the protein; and 

c) a protein coding region which codes for a protein 
said protein coding region being linked in proper translational 
reading frame with said signal peptide coding region. 

37. The construct according to claim 36 wherein at least 
one of said promoter, said signal peptide coding region and said 
protein coding region is foreign to the other or others of said 
promoter, said signal peptide coding region and said protein 
coding region. 

38. The construct according to claim 37 wherein said 
protein coding region is foreign to said signal peptide coding 
region and to said promoter. 

39. The construct according to claim 38 wherein said 
promoter and said signal peptide coding region are native to a 
filamentous fungus. 

40. The construct according to claim 38 wherein said 
promoter and signal peptide coding region are native to 
Aspergillus sp. 
41. The construct according to claim 40 wherein the 
promoter is selected from the promoter of the alcohol 
dehydrogenase I gene of Aspergillus nidulans , the promoter of 
the aldehyde dehydrogenase gene of Aspergillus nidulans and the 
promoter of the glucoamylase gene of Aspergillus niger . 

42. The construct according to claim 40 wherein the 
promoter is the promoter of the glucoamylase gene of Aspergillus 
niger and the signal peptide coding region is the signal peptide 
coding region of said gene. 

43. The construct according to claim 42 wherein the protein 
coding region is the endoglucanase coding region of the C. fimi 
endoglucanase gene. 

44. The construct according to claim 43 as contained on 
plasmid pGL2CENDO having ATCC catalogue number 53372. 

45. The construct according to claim 42 wherein the protein 
coding region is the interferon α-2 coding region of the human 
interferon α-2 gene. 

46. The construct according to claim 45 as contained on 
plasmid pGL2BIFN having ATCC catalogue number 53371. 

47. The construct according to claim 38 wherein said signal 
peptide coding region has a nucleotide sequence characteristic 
of signal peptide coding regions operative in and native to 
filamentous fungi. 

48. The construct according to claim 47 wherein the 
promoter is native to a filamentous fungus. 

49. The construct according to claim 48 wherein the signal 
peptide coding region has a nucleotide sequence substantially as 
shown below: 
5' GTCGACATGTACCGGTTCCTCGCCGTCATCTCGGCCTTCCTCGCCACTGCC 3' 



50. The construct according to claim 49 wherein the 
promoter operatively associated with the signal peptide coding 
region is selected from the promoter of the alcohol 
dehydrogenase I gene of A. nidulans, the promoter of the 
aldehyde dehydrogenase gene of A. nidulans and the promoter of 
the glucoamylase gene of A. niger. 

51. The construct according to claim 49 comprising the 
promoter of the alcohol dehydrogenase I gene of Aspergillus 
nidulans . 

52. The construct according to claim 51 wherein the protein 
coding region linked with said signal peptide coding region is 
the endoglucanase coding region of the C. fimi endoglucanase 
gene. 

53. The construct according to claim 52 as contained on the 
plasmid pALCAlSENDO having ATCC catalogue number 53370. 

54. The construct according to claim 51 wherein the protein 
coding region linked with the signal peptide coding region is 
the interferon α-2 coding region of the human interferon α-2 
gene. 

55. The construct according to claim 54 as contained on the 
plasmid pALCAlSIFN having ATCC catalogue number 53369. 

56. A DNA expression vector with which a host filamentous 
fungus may be transformed when said vector is introduced into 
said host, the expression vector comprising a DNA construct as 
defined in any one of claims 10-24 and 36-55. 
57. The DNA expression vector according to claim 56 wherein 
the host which may be transformed therewith is of Aspergillus sp. 

58. The DNA expression vector according to claim 56 wherein 
the host which may be transformed therewith is Aspergillus 
nidulans . 

59. The DNA expression vector according to claim 56 wherein 
the host which may be transformed therewith is Aspergillus niger. 

60. The DNA expression vector according to claim 57 in 
plasmid form. 

61. The DNA expression vector according to claim 57 which 
is plasmid pDG6 ATCC 53169. 

62. The DNA expression vector according to claim 57 which 
is plasmid pALCAlAMY ATCC 53380. 

63. The DNA expression vector according to claim 57 which 
is plasmid pALCAlSENDO ATCC 53370. 

64. The DNA expression vector according to claim 57 which 
is plasmid pALCAlSIFN ATCC 53369. 

65. The DNA expression vector according to claim 57 which 
is plasmid pGL2BIFN ATCC 53371. 

66. The DNA expression vector according to claim 57 which 
is plasmid pGL2CENDO ATCC 53372. 

67. A filamentous fungal cell having the capacity to 
express proteins foreign thereto. 

68. A filamentous fungus cell transformed by a DNA 
construct as defined in any one of claims 10-24 and 36-55. 
69. Transformants according to claim 68 which are of the 
species Aspergillus sp. 

70. Transformants according to claim 68 which are of the 
species Aspergillus nidulans. 

71. Transformants according to claim 68 which are of the 
species Aspergillus niger. 

72. Transformants according to claim 68 containing multiple 
copies of said constructs. 

73. Transformants according to claim 72 wherein the 
transformants express greater than normal levels of regulatory 
gene products. 

74. Transformants according to claim 73 wherein the greater 
than normal levels of regulatory gene products result from 
multiple copies of genes which regulate the transcription 
promoting function of a promoter comprised by said construct. 

75. Transformants according to claim 74 containing multiple 
copies of a construct comprising the promoter of the alcohol 
dehydrogenase I gene and multiple copies of the alcR gene. 

76. A process for expressing a protein in a filamentous 
fungus cell as defined in claims 68-75 which comprises culturing 
the fungus cells under appropriate growth conditions. 

77. A process as defined in claim 76 wherein the fungus is 
cultured under conditions in which the presence of substances 
which induce or repress the transcription promoting function of 
the promoter is controlled. 

78. A filamentous fungal cell having the capacity to 
secrete proteins foreign thereto. 
79. A filamentous fungal cell transformed by a DNA 
construct as defined in any one of claims 18-24 and 36-55. 

80. Transformants according to claim 79 which are of the 
species Aspergillus sp. 

81. Transformants according to claim 79 which are of the 
species Aspergillus nidulans . 

82. Transformants according to claim 79 which are of the 
species Aspergillus niger . 

83. Transformants according to claim 79 containing multiple 
copies of said constructs. 

84. Transformants according to claim 83 wherein the 
transformants express greater than normal levels of regulatory 
gene products. 

85. Transformants according to claim 84 wherein the greater 
than normal levels of regulatory gene products result from 
multiple copies of genes which regulate the transcription 
promoting function of a promoter comprised by said construct. 

86. Transformants according to claim 85 containing multiple 
copies of a construct comprising the promoter of the alcohol 
dehydrogenase I gene and multiple copies of the alcR gene. 

87. A process for obtaining expression of and secretion of 
a protein in a filamentous fungus cell as defined in claims 
79-86 which comprises culturing the fungus under appropriate 
growth conditions. 

88. The process according to claim 87 wherein the fungus 
cell is cultured under conditions in which the presence of 
substances which induce or repress the transcription promoting 
function of the promoter is controlled. 



WO 86/06097 PCT/GB86/00209 



GGATAC AGTTGGGC ATTTCTAGGGCTGAATGGGAAGGAGAGAGTTTTG AAATAGGCGTTC 
+ + + + + + 



CGTTCTGCTTAGGGTATTTGGGAAC AATCAATGTTCAATGTAC ATTTAATCC ACG ATTTT 
+ + + + + + 



ATAAAACGTC ATCCTTTGCCCTC CCTTCTTATTTGCC AATACC AAAAATCTTACTCC AGT 
+ + + + + + 



GGTTCGGTAATCGCAGAGTTAAATCTGGGCTCGGTGGC AGATCTGC GATCGTCC ATAACC 
+ + + + + + 



GTTC AGATGTTG ATTGGAACTGGGTGGGGTAGAC AGCTCCGAAGACCGAGTG AACGTATA 
+ + + + + + 



CCT AAGACACTTTGAC ACGGCCGGAACACTGTAAGTCCCTTCGT ATTTCTCCGCC TGTGT 
+ + + + + + 



GGAGCTACC ATCC AAT AACCCCC AGCTGAAAAAGCTGATTGTCGATAGTTGTGATAGTTC 
+ + + + + + 



CC AC TTGTCCGTCCGCATC GG CATC CGCAGCTCGGGA TAG TTCCGACCTAGG AT TGGATG 

+ + + ; + + + 

CATGCGGAACCGCACGAGGGCGGGGCGGAAATTGACACACCACTCCTCTCCACGCAGCCG 
+ + + + + + 



TTC AAG AGGTACGCGTATAGAGCCGT ATAGAGCAGAG ACGG AGCACTTTCTGGTACTGTC 
+ + + + + + 



CGC ACGGGATGTCCGC ACGGAGAGCC AC AAACGAGCGGGGCCCCGTACGTGCTCTCCTAC 
+ + + + : + + 

CCC AGGATCGC ATCCTCGCAT AGCTGAACATCTATAT AAAGACCCCC AAGGTTCTC AGTC 
+ + + + + ~ h 



TCACC AAC ATC ATCA ACC AAC AATC AACAGTTCTCTACTC AGTTAATTAGAAC ACTTCC A 
+ + + + + + 

ATCCTATCACCTCGCCTCAAAATGTGCATCCCCACTATGCAATGGGCCCAGGTCGCCGAG 
+ + + + + + 

MetCysIleProThrMetGlnTrpAlaGlnValAlaGlu 

FIG. 1A sheet 1 
841 

AAGGTCGGCGGCCCGCTCGTCTACAAGCAGATCCCCGTCCCTAAGCCCGGTCCCGACCAG 

+ + + + + + 

LysValGlyGlyProLeuValTyrLysGlnIleProValProLysProGlyProAspGln 

ATCCTTGTGAAGATCCGCT ACTCTGGGGTTTGCC AC ACCGACC TAC ACGCTATG ATGGGT 

+ + + + + + 

IleLeuValLysIleArgTyrSerGlyValCysHisThrAspLeuHisAlaMetMetGly 



C AC TGGCC AATCCCCGTC AAAATGCCGCTCGTCGGTGGGCACG AAGGAGCAGG AATCGTC 

+ + + + + + 

HisTrpProIleProValLysMetProLeuValGlyGlyHisGluGlyAlaGlyIleVal 

.15 



SI 



GTGGC AAAGGGC G AAC TGGTCC ACG AATTCG AGATCGGCG ACCAAGCTGGC ATCAAATGG 

+ + + + + — . + 

ValAlaLysGlyGluLeuValHisGluPheGluIleGlyAspGlnAlaGlyIleLysTrp 

CTTAATGGTTCCTGCGGAG AGTGCG AGTTCTGCCGCC AATCGGACGACCCCCTCTGTGC A 

+ + + + + + 

LeuAsnGlySerCysGlyGluCysGluPheCysArgGlnSerAspAspProLeuCysAla 

CGCGCCC AGCTCTCTGGGTATACTGTTGACGGCACGTTCCAGC AGTATGCGCTCGG AAAG 

_ + + + + + + 

ArgAlaGlnLeuSerGlyTyrThrValAspGlyThrPheGlnGlnTyrAlaLeuGlyLys 

GCG AGTC ATGCGTCG AAG ATCCCTGCGGGCGTTCCGGTGGATGCCGCGGCCCC AGT ACTC 

+ + + + + + 

AlaSerHisAlaSerLysIleProAlaGlyValProValAspAlaAlaAlaProValLeu 

TGTGCCGGT ATTACAGTGT ACAAGGG ATTGAAAGAGGCCGGGGTCCGGCCGGGCCAGACC 

+ + + -+ + + 

CysAlaGlyIleThrValTyrLysGlyLeuLysGluAlaGlyValArgProGlyGlnThr 

GTGGCGATCGTGGGTGCCGGTGGCGGCCTGGG ATCCCTTGCAC AGCAGTATGCG AAGGCG 

+ + + + + + 

ValAlaIleValGlyAlaGlyGlyGlyLeuGlySerLeuAlaGlnGlnTyrAlaLysAla 

ATGGGGATC AGGGTTGTCGCGGTCG ATGG GGG AG ATG AG AAGCGGGCCATGTG T GAG TCG 

+ + + + + + 

MetGlyIleArgValValAlaValAspGlyGlyAspGluLysArgAlaMetCysGluSer 

CTTGG AAC AG AGGTATGTACATGTTCTCAATCTC AGGAAGCAAGC AACTGACCTGG ACAG 



LeuGlyThrGlu IVS 1 



AC AT ATGTCGACTTCACCAAATCTAAAGACCTGGTGGCCG ACGTCAGGCACGGACGCGGA 

+ + +- + + + 

ThrTyrValAspPheThrLysSerLysAspLeuValAlaAspValArgHisGlyArgGly 

1560 



10 



FIG. 1A sheet 2 
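As a reading aid for the IVS 1 annotation on this sheet, the following Python sketch (not part of the original figure) removes the intervening sequence and translates the joined exons. The exon and intron boundaries are an assumption, inferred from the amino acid lines of the figure and the usual GT...AG intron rule. The names `splice`, `EXON_1`, `IVS_1` and `EXON_2` are hypothetical.

```python
# Standard genetic code, built with bases in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons from position 0; trailing bases are ignored."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def splice(genomic, intron_start, intron_end):
    """Remove genomic[intron_start:intron_end] after checking the GT...AG rule."""
    intron = genomic[intron_start:intron_end]
    if not (intron.startswith("GT") and intron.endswith("AG")):
        raise ValueError("intron does not follow the GT...AG rule")
    return genomic[:intron_start] + genomic[intron_end:]


# Excerpt around IVS 1, with OCR spacing removed (assumed boundaries).
EXON_1 = "CTTGGAACAGAG"   # ...LeuGlyThrGlu
IVS_1 = "GTATGTACATGTTCTCAATCTCAGGAAGCAAGCAACTGACCTGGACAG"
EXON_2 = "ACATATGTCGAC"   # ThrTyrValAsp...

GENOMIC = EXON_1 + IVS_1 + EXON_2
spliced = splice(GENOMIC, len(EXON_1), len(EXON_1) + len(IVS_1))
protein = translate(spliced)

if __name__ == "__main__":
    print(spliced)
    print(protein)
```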
1561 

TGTCTGGGTGCGCACGCGGTG ATCCTGC TTGCCGTGTC AGAG AAGCCCTTCC AGC AAGCC 

+ + + + + 

CysLeuGlyAlaHisAlaValIleLeuLeuAlaValSerGluLysProPheGlnGlnAla 



ACTGAATATGTGCGC TCTCGCGGGACAATTGTTGCTATTGGCTTGCCGCC AG ATGCGT AC 
+ + + + + + 

ThrGluTyrValArgSerArgGlyThrIleValAlaIleGlyLeuProProAspAlaTyr 



CTC AAGGCCCCTGTGATC AAC AC AGTTGTTCGC ATGATC ACT ATC AAGGGC AGCTACGTT 

+ + > + + + + 

LeuLysAlaProValIleAsnThrValValArgMetIleThrIleLysGlySerTyrVal 

GGAAACCGACAGG ACGGTGTCGAGGCTCTGGACTTCTTCGCTCGCGGCCTGATC AAGGCT 
+ + + + + + 

GlyAsnArgGlnAspGlyValGluAlaLeuAspPhePheAlaArgGlyLeuIleLysAla 



CCGTTC AAGACGGCTCCTCTGAAGGATCTGCCGAAGATTTACGAGCTTATGGGTGCGTTG 

+ + + + + + 

ProPheLysThrAlaProLeuLysAspLeuProLysIleTyrGluLeuMet 



ACTCCC ATATCCG ATGTTC AATTCTCTTTGCGCGATATATTTAGATACTAATGGCTTGCA 
+ + + + + + 

I V S 2 



GAACAAGGC AGAATCGCCGGTCGTTATGTGCTAGAGATGCCAGAATAAGCGTTTCAACGC 
+ + + + + 

GluGlnGlyArgIleAlaGlyArgTyrValLeuGluMetProGluEnd 



10 



CC ACGGGCTGGA ACTAC AAAC ACAATCGTC AG ATGTTTC ATGTTTATGATGTCC ATGCTT 
+ + + + + + 



G ATATCTTTGT AT AT AGTTTTC AATC AAGTGGTACAATG ATTTTGGCCTTGGTTC AACCG 
+ + + + + + 



ATTTCCCTTCC ACTTTTCCTAGCCTGATACGGATAGC ACTTGTAAGC AATCATAAACCAC 
+ + + + + + 



AG ACTGG ATAG ACTGGG AAGT ATAGGTATTATGGTAGCC AAATGCG ATGATTTG ACTTTG 
+ + + + + + 



AATC AAGGCTC AAACTAGCCCTCACCGTCTC TTTTTGCGGTTATTTTTAGTCG ATTCACG 
— + + + + + + 
GGATCCATTTTCCAGCATGTCGACCAAACTGCAAATACAAGTGTACGAAGGACGGCGTAT 
+ + + + + + 



AGTAACGGAAGG ACTCCG AGCC AAGC AACCG AGAATG ACGTCTC AGACTC TGCGAGTGAG 
+ + . 



GCGGGCTCCAATC AGGG AACTTCTGC ATGGTC ATC AACCCCGC ATG ATCTTCTC ATC ACG 
+ + + + + + 

CCTCTTGGTTCGTAATTTTC ATTTTTGC ATTACGGCCTCGGTTATC ATCGC AGCC TCC AC 
+ + + + + + 

CAC ATAGTCGTC AAGATAGGTCC AGAATC AGTCCGCTC TAGGGGGGT AAATCGTAAATTG 
+ + . . 



C AATTCGC ATTACGGTCTGGGTTATCG ATCGCGGGG ATCCTC AAC TTTGTTTC AGAACCA 



GGGTGCTGTAGGTTGTAGATCGTAAGTTTC ATCCTGC ATTACCCGCCTCGGTTATTATCG 



CGAGCTCTTC AACGTGTTTTC AGAATCATCTAGGCTCGTGGAGGC AGTGGGC ACCGCGGC 
+ — + + : + + + 



GAAGGGGACGGAATGC AGTTCACCTGG ACTGGCTCTTG AAG ACCAGTGGGGC ACTTCGGC 
• — + + + +-- + + 

GGGTTGCT AGCTTGCTAC ATGTAATTTCC ATGGGTAAC AGCTATCCTC AACAAGAGCGGC 
+ + + + + + 

TCCGCTTGACC TGTTCCCCTCCTTTCCCCTCTTTTGCTGCGACC ACTGGCTC AGTGCTAC 
+ +-- + + + + 

^16" 

CAAAGCCAGAGCGGT ATT ATTAAAGCTCCCTCGTCCTC CC ACCG AGCC AGCATTTCTCCG 



r 



12 



TAC TCC AACTCTCCTCTCCCAAG ATACCC ATATTTCCCGCTCACC ATGTC TGATTTGTTC 
+ + + + + + 

Me t Ser Asp Leu Phe 

ACC ACC ATCGAGACTCCGGTCATCAAATATG AGC AGC CTCTCGGCCTGTATGACGTTTTC 



ThrThrlleGluThrProVallleLysTyrGluGlnProLeuGlyLeu. 
841 

TCGCC TCC TGATTTTTTTTGTGTTGTGTTTATTAACG ATC ATTGGGTTGTAGGTTC ATC A 

+ + + + + + 

IVS 1 . . . • Phe lie A 



AC AACGAGTTCGTG AAGGGCGTTGAGGGC AAGACCTTCC AGGTCATC AACCCCTCC AACG 
+ + + + -+ + 

snAsnGluPhe ValLysGly ValGluGlyLys Thr Phe Gin Val lie AsnPro Ser AsnG 



AGAAGGTCATC ACCTCCGTCCACGAAGCCACCG AGA AGGATGTTGATGTCGCCGTCGCTG 
+ + + + + + 

luLysVal He Thr Ser Val Hi sGlu AlaThrGlu Ly s Asp Val Asp Val Al a Val Ala A 



C TGCCCGTGC TGCC TTT GAg GGG C CATGGCGC CAGGTCACCCCCTCTG AG CGTGGC AT TT 
+ + + + + + 

la Ala Arg Al a Ala Phe GluGly ProTrpArgGln Val Thr Pro SerGluArgGly IleL 



TGATC AAC AAGCTGGCGG ATCTGATGGAGC GTGATATCG ACACCC TCGCCGCT ATCG AGT 
+ + + + + 

eu lie AsnLys Leu Ala Asp Leu Me t Glu Arg Asp He Asp Thr LeuAlaAlalleGluS 



CTCTCGAC AACGGC AAGGC TTTCACC ATGGCC AAGGTCG ATCTTGCC AACTCc ATTGGTT 
1 — + + + : — + + + 

er Leu Asp AsnGlyLys AlaPheThrMe t AlaLys Val AspLeuAlaAsnSerlleGlyC 



GCTTGCGATACTACGCTGGCTGGGCGGAC AAGATTCACGGTC AGACC ATTGAC ACC AACC 
+ + + + + + 

ysLeuArgTyr Tyr AlaGly TrpAla AspLys He Hi sGlyGlnThr He Asp Thr As nP 



CCGAGACTCTTACCTAC ACCCGCCACG AGCCCGTTGGTGTTTGCGGTC AGATC ATCCCCT 

+ + + + + + 

roGluThrLeuThrTyrThrArgHisGluProValGlyValCysGlyGlnllelleProT 

GG AAC TTCCCCCTTCTGATGTGG TCC TGGAAGATTGG AC CCGCTGTTGCCGCTGGT AAC A 
+ + + + + + 

rp AsnPheProLeuLeuMe tTrp Ser TrpLy s II eGly Pro Ala Val Ala AlaGly As nT 



CTGTTGTCCTC AAGACCGCCC AGC AGACCCCTCTCTCCGCCC TTTACGCTGCT AAGCTGA 
+ + + + + + 

hrValValLeuLysThrAlaGlnGlnThr Pro Leu Ser Ala Leu Tyr Ala AlaLys Leu I 



TC AAGG AGGCTCC At t CCCC GCTGGTG TGATC AAC GTC ATC TCTGGCTTTGGCC GT ACC G 
+ + + + + + 

leLys Glu Ala Pro Phe Pro AlaGly Val He Asn Val He SerGly PheGlyAr gThr A 

CTGGTGCTGCCATCTCCAGCC AC ATGGAC ATTGAC AAGG TTGCCTTC AC TGGCTCTACTC 

+ + + + + + 

laGlyAlaAlalleSer Ser His Met Asp lie AspLys Val AlaP he Thr Gly Ser ThrL 
TTGTTGGACCTACC AtCCTGC AGGCCGCTGCTAAGAGC AACCTGAAGAAGGTCACTCTTG 

+ + + + + + 

euValGlyProThrlleLeuGltiAlaAlaAlaLysSer AsnLeuLysLys ValThr LeuG 

AGCTCGGTGGC AAGTCTCCC AAC ATCGTCTTTGATGATGCTGAC ATTGAC AACGCC ATTT 

+ + + + + + 

luLeuGlyGlyLysSerProAsnlleValPhe Asp Asp Ala Asp lie Asp As n Ala lie S 

CCTGGGC C AACTTTGGTATCTTCTTCAACCACGGCC AGTGCTGC TGTGCTGG ATCCCGTA 

+ + + + + + 

erTrp Ala Asn Phe Gly II e Phe Phe AsnHi s GlyGlnCysCy s Cy s AlaGlySer ArgI- 

TCCTGGTCCAGGAGGGC ATCTACGAC AAGTTCGTCGCCCGCTTC AAGGAGCGTGCCC AGA 

+ + + + + + 

leLeuValGlnGluGlylleTyrAspLysPhe ValAlaArgPheLysGluArgAlaGl nL 

AG A AC AAGGTCGG AAACCCCTT CG AGC AGG AC ACCTTCC AGGGTCCCCAGGTTTCCC AGC 

+ + + + + + 

ysAsnLysValGlyAsnProPheGluGlnAspThrPheGlnGlyProGlnValSerGlnL 

TCC AGTTCG ACCGTATC ATGGAGTAC ATC AACC ACGGC AAGAAGGCTGGTGCTACCGTCG 

+ + + + +- + 

e uGlnPheAspArglleMetGluTyrIleAsnHisGlyLysLysAlaGlyAlaThrValA 

CC ACCGGTGGTG ACCGCC ACGGC AACG AGGGTT ACTTC ATCC AGCCTACTGTCTTC AC AG 

+ + + + + + 

laThrGlyGlyAspArgHisGlyAsnGluGlyTyrPhelleGlnProThrValPheThrA 

ACGTC ACTTCCG AC ATGAAGATTGCCCAGG AGG AGATCTTCGGTCCTGTCGTCACTATCC 

+ + + + + + 

s p ValThr SerAspMetLys He AlaGlnGluGluIle PheGlyProVal ValThr II eG 

AGAAGTTCAAGGATGTGGCTGAGGCTATCAAGATCGGC AACTCGACCGACT ACGGTACGT 
4. + + + + + 



InLysPheLys AspVal AlaGluAlalleLysIleGlyAsnSerThr AspTyr 

CTATCTTTTCTGGTC TTTGCCGATATTTTGTTGCTAACAT ACGC AC AGGTCTTGCTGCTG 

+ + + + + + 

IVS 2 GlyLeuAlaAlaA 

CCGTGC AC AC AAAGAACGTC AAC ACCGCC ATTCGCGTGTCC AAC GCTCTGAAGGC TGGTA 

+ + + + •+ + 

laValHisThrLysAsnValAsnThrAlalleArgValSerAsnAlaLeuLysAlaGlyT 

CCGTCTGGATC AAC AACTAC AAC ATG ATCTCGTACC AGGCTCCCTTCGGTGGCTTC AAGC 

. . + + + + 



hrValTrpIleAsnAsnTyrAsnMetlleSerTyrGlnAlaProPheGlyGlyPheLysG 
AGTCCGGTCTCGGCCGTGAGC TTGGCTCT TACGCTCTTGAG AAC TAC AC AC AG ATC AAGA 

+ + + + + + 

InSe rGlyLeuGlyArgGluLeuGlySerTyr AlaLeuGluAsnTyr ThrGlnlleLys T 

C GGTGC ACTACCGCCTGGGTGATGCTC TTTTCGCTTAAAGCT AATTGTATGATTTGATGA 

+ + + + + + 

hrValHisTyrArgLeuGlyAspAlaLeuPheAlaEnd          2400 



10' 



AATTGCG AAT AC AAGTTGG ATATATCCTGTGTGCTACGGC ACTGGTTCAAATTGCTTCTT 
+ -i- + + + + 



GTGC AGC AACC ATGTG ACTC ATGTAAAACAT ATCAGATAACCCCGG ATACGATTTTACGA 
+ + + + + + 



TTTTTTAGATTTGC TTTTATCGTAGCGTCC ACTTATCCTCGTCCGGCCAAGC AC AAAACC 
+ — + + + + + 



TATGGCTATCTTC AGC ACGCCGCG ATCCTG AACCGT AGCTGGATTGGAAATCCG AAATC A 
+ + + + + + 



ACTGCCCCGC AGCC ACCG AC ACTCGGGCTCCGGGCAAGTCCCCGCG AAATCCCTC ACC AC 
+ + + + + + 



2700 
Linker A (sites, 5' to 3': BssHII, BglII/XhoII, EcoRV, SstI) 

1 CGCGCAGATCTCGATATCGAGCT 23 
      GTCTAGAGCTATAGC 
  ArgAlaAspLeuAspIleGlu?? 

Linker B (sites, 5' to 3': BssHII, BglII/XhoII, EcoRV, SstI) 

1 CGCGCAAGATCTCGATATCGAGCT 
      GTTCTAGAGCTATAGC 
  ArgAlaArgSerArgTyrArgAla 

Linker C (sites, 5' to 3': BssHII, BglII/XhoII, EcoRV, SstI) 

1 CGCGCAAAGATCTCGATATCGAGCT 
      GTTTCTAGAGCTATAGC 
  ArgAlaLysIleSerIleSerSer? 



EcoRI 



GAATTCAAGCT AGATGC TAAGCG ATATTGC ATGGC AATATGTGTTGATGCATGT GCTTCT 
TCCTTCAGC TTCCCC TCGTGC AG ATGAAGGTTTGGCTATAAATTGAAGTGGTTGGTCGGG 



GTTCCGTGAGGGGCTGAAGTGCt TCC TCCCTTTTAGACGCAACTG AGAGCCTGAGCTTC A 
+ + + * + + + 

^49 ■ 

TCCCC AGC ATCATTAC ACCTC AGCAATGTCGTTCCGATCTCTACTCGCCCTGAGCGGCCT 

+ + + + + + 

MetSerPheArgSerLeuLeuAlaLeuSerGlyLe 

BssHII 

CGTCTGC AC AGGGTTGGCAAATGTGATTTCC AAGCGCGCG 
uValCysThrGlyLeuAlaAsnValIleSerLysArgAla 

280 
181 



40 



v 



49 

L 



42 
TCCCC AGC ATCATTAC ACCTC AGC AATGTCGTTCCGATC TCTACTCGCCCTG AGCGGCCT 
AGGGGTC GT AGTAATGTGG AGTCGTTAC AGC AAGGCT AG AG ATGAGCGGGAC TCGCCGGA 

240 

MetSerPheArgSerLeuLeuAlaLeuSerGlyLeu 



.241 



42 

L 



50 



52 



u f 



44B' 

\ 



Ddefilled 



CGTCTGCACAGGGTTGGC AAATGTG ATTTCCAAGCGCGC AAGATCTCGATTC AGCTGCAA 

■ + + + + + + 

GCAGACGTGTCCCAACCGTTTAC AC TAAAGGTTCGCGCGTTCT AGAGCTAAGTCGACGTT 

ValCysThrGlyLeuAlaAsnValIleSerLysArgAlaArgSerArgPheSerCysLys 



301 



60 



i- 



GTCAAGCTGCTCTGTGGGCTGTGATCTGCCTCAAACCC ACAGCCTGGGTAGC AGGAGGAC 
+ + + + + + 

CAGTTCGACGAGACACCCG AC ACTAGACGGAGTTTGGGTGTCGGACCCATCGTCCTCCTG 

360 

SerSerCysSerValGlyCysAspLeuProGlnThrHisSerLeuGlySerArgArgThr 



361 



60 



CTTGATGCTC 
+ 

GAACTACG AG 
LeuMetLeu 



44B 



1 



EcoRI 
1_ 



AAAACCGG ATCATCGAGCTC GAATTC 



TTTTGGCCTAGTAGCTCGAG CTTAAG 

1206 
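Because the double-stranded figures print the lower strand 3' to 5' directly beneath the upper strand, a transcription of either strand can be checked against the other. The Python sketch below (not part of the original figure) does this for the EcoRI end shown above. The helper names `complement` and `mismatches` are hypothetical.

```python
# Base-pairing table for an unambiguous DNA alphabet.
PAIRS = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Return the base-by-base complement (not reversed) of seq."""
    return seq.upper().translate(PAIRS)


def mismatches(top, bottom):
    """Return 0-based positions where bottom is not the complement of top."""
    if len(top) != len(bottom):
        raise ValueError("strands differ in length")
    expected = complement(top)
    return [i for i, (e, b) in enumerate(zip(expected, bottom.upper())) if e != b]


# Upper and lower strands of the EcoRI end, with OCR spacing removed.
TOP = "AAAACCGGATCATCGAGCTCGAATTC"
BOTTOM = "TTTTGGCCTAGTAGCTCGAGCTTAAG"

if __name__ == "__main__":
    print("mismatches:", mismatches(TOP, BOTTOM))
    print("EcoRI site starts at:", TOP.find("GAATTC"))
```

A non-empty result from `mismatches` points at positions where one of the two printed strands has probably been misread.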



1200 

GGATAC AGTTGGGC ATTTCTAGGGCTGAA 
_ + + — + 

.CCTATGTCAACCCGTAAAGATCCCG ACTT 



TGGGAAGGAG AGAGTTTTG AAAT AGGCGTTCCGTTCTGCTTAGGGTATTTGGGAAC AATC 
+ + + ,_+- + + 

ACCCTTCCTCTCTC AAAACTTTATCCGC AAGGC AAG ACGAATCCC ATAAACCC TTGTTAG 



AATGTTCAATGTACATTTAATCCACGATTTTATAAAACGTCATCCTTTGCCCTCCCTTCT 
+ + + + + + 

TTAC AAGTTAC ATGTAAATTAGGTGCTAAAAT ATTTTGCAGTAGGAAACGGGAGGGAAG A 



TATTTGCC AATACC AAAAATCTTACTCCAGTGGTTCGGTAAT CGC AGAGTTAAATCTGGG 

+ + + + + + 

ATAAACGGTTATGGTTTTTAG AATG AGGTCACC AAGCC ATTAGCGTCTC AATTTAGACCC 



CTCGGTGGC AGATCTGCGATCGTCC ATAACCGTTC AG ATGT TGATTGGAAC TGGGTGGGG 

+ + + + + + 

GAGC C ACCGTCTAGACGCTAGC AGGTATTGGCAAGTCTAC AAC TAACCTTG ACC CACCCC 



/ 

/ 



TAGACAGCTCCGAAG ACCG AGTG AACGTATACCT AAG AC ACTTTGAC ACGGCCGGAAC AC 

+ 4. + + + + 

ATCTGTCGAGGCTTCTGGCTCACTTGC ATATGGATTCTGTGAAACTGTGCCGGCC TTGTG 



TGTAAGTCCCTTCGTATTTCTCCGCCTGTGTGGAGCTACCATCCAATAACCCCCAGCTGA 
+ + + + + + 
ACATTCAGGGAAGCATAAAGAGGCGGACACACCTCGATGGTAGGTTATTGGGGGTCGACT 



1561 

AAAAGCTG ATTGTCGATAGTTGTGATAGTTCCCACTTGTCCGTCCGCATCGGC ATCCGCA 
+ + _._. + + + + 

TTTTCGACT AAC AGCTATC AAC ACTATCAAGGGTGAAC AGGC AGGCGT AGC CGTAGGCGT 



GCTCGGG ATAGTTCCG ACCTAGG ATTGGATGC ATGCGGAACCGC ACg AGGGCGGGGCGGA 
+ + + + + + 

CGAGCCCTATCAAGGC TGG ATCCTAACCTACGTACGCC TTGGCGTGcTCCCGCCCCGCCT 



AATTG AC ACACC ACTCCTC TCC ACGC AgCCGTTC AAGAGGTACGCGTATAG AGCCGTATA 
+ + _; + + + + 

TTAACTGTGTGGTGAGGAGAGG TGCGTcGGC AAGTTCTCC ATGCGC ATATCTCGGC AT AT 



G AGC AGAGACGGAGCACTTTC TGGTACTGTCCGC ACGGG ATGTCCGC ACGGAG AGCC AC A 

+ + H + + + 

CTCGTCTCTGCCTCGTGAAAGACCATGAC AGGCGTGCCCTACAGGCGTGCCTCTCGGTGT 



AACGAGCGGGGCCCCGTACGTGCTCTCCTACCCCAGGATCGC ATCCTCGC ATAGCTGAAC 
+ + + + + + 

TTGCTCGCCCCGGGGC ATGC ACGAGAGGATGGGGT CCT AGCGT AGG AGCGTATCG ACTTG 



ATCTATATAAAGACCCCC AAGGTTCTC AGTCTC ACC AAC ATCATC AACC AAC AATC AAC A 
+ + + + + -+ 



FIG. 11 sheet 2 
TAG ATAT ATTTCTGGGGGTTCC AAGAGTC AGAGTGGTTGTAGTAGTTGGTTGTTAGTTGT 

Sal I 68 



1921 



1 L 



GGG TCG AC ATGTACCGG TTC C TCGCCGT CAT C TCGGCCTTC CTCGCC ACT GCCTTCG CCA 

+ + + + + +

CCC AGCTGTAC ATGGCC AAGG AGCGGC AGT AG AGC C GG A AG G AG CGGTG AC GGAAGCGGT 

MetTyrArgPheLeuAlaValIleSerAlaPheLeuAlaThrAlaPheAlaLys

XbaI BamHI/BglII fusion

AG*TCT AGAGGATC TCGATTC AGC TGC AAGTC AAGCTGCTCTGTGGGCTGTGATCTGCCTC 

+ + + + + + 

TC AGATCTCCTAGAGCTAAGTCGACGTTC AGTTCGACGAGAC ACCCG AC ACTAGACGG AG 

SerArgGlySerArgPheSerCysLysSerSerCysSerValGlyCysAspLeuProGln



AAACCC AC AGC C TGGG TAGC A GGAGG AC CTTGATGCTCC TGGC AC AG ATGAGG AG AATCT 

+ + + + + + 

TTTGGGTGTCGGACCC ATCGTCCTCC TGG AACTACGAGGACCGTGTCTACTCCTCTTAGA 

ThrHisSerLeuGlySerArgArgThrLeuMetLeuLeuAlaGlnMetArgArgIleSer



C TCTTTTCTCCTGCTTGAAGGACAGAC ATG ACTTTGGATTTCCCC AGGAGGAGTTTGGC A 

+ + + + + + 

GAGAAAAGAGGACGAACTTCCTGTCTGTACTGAAACCTAAAGGGGTCCTCCTCAAACCGT

LeuPheSerCysLeuLysAspArgHisAspPheGlyPheProGlnGluGluPheGlyAsn



ACCAGTTCC AAAAGGCTGAAACC ATCCCTGTCCTCC ATGAGATGATCCAGC AGATCTTCA 

+ + + + + + 

TGGTC AAGGTTTTCCGAC TTTGGTAGGGAC AGG AGGTAC TCTACT AGGTCGTCT AGAAGT 

2220 

GlnPheGlnLysAlaGluThrIleProValLeuHisGluMetIleGlnGlnIlePheAsn
2221

ATCTCTTCAGC AC AAAGGACTCATC TGCTGCTTGGG ATG AGACCCTCCTAGACAAATTCT 
TAGAGAAGTCGTGTTTCCTGAGTAGACGACG AACCCTACTCTGGG AGGATCTGTTTAAGA 



LeuPheSerThrLysAspSerSerAlaAlaTrpAspGluThrLeuLeuAspLysPheTyr



AC AC TGAACTCTACCAGC AGCTG A AT G AC CTGGAAGCC TGT GTG AT AC AGGGGGTGGGGG 

+ + + + + + 

TGTGACTTGAGATGGTCGTCGACTTACTGG ACCTTCGGACACACTATGTCCCCCACCCCC 

ThrGluLeuTyrGlnGlnLeuAsnAspLeuGluAlaCysValIleGlnGlyValGlyVal



TGAC AGAGACTCCCCTGATGAAGGAGG ACTCC ATTCTGGCTGTGAGG AAATACTTCCAAA 
+ + + + + + 

ACTGTCTCTG AGGGG ACT AC TTCCTCCTGAGG TAAG AC CG AC AC TCCTTT ATG AAGGTTT 
ThrGluThrProLeuMetLysGluAspSerIleLeuAlaValArgLysTyrPheGlnArg



GAATCACTCTCTATCTGAAAGAGAAGAAATACAGCCCTTGTGCCTGGGAGGTTGTCAGAG

+ + + + + +

CTTAGTGAGAGATAGACTTTCTCTTCTTTATGTCGGGAACACGGACCCTCCAACAGTCTC

IleThrLeuTyrLeuLysGluLysLysTyrSerProCysAlaTrpGluValValArgAla



60 



CAGAAATCATGAGATCTTTTTCTTTGTCAACAAACTTGCAAGAAAGTTTAAGAAGTAAGG
+ + + + + +

GTCTTTAGTACTCTAGAAAAAGAAACAGTTGTTTGAACGTTCTTTCAAATTCTTCATTCC

2520

GluIleMetArgSerPheSerLeuSerThrAsnLeuGlnGluSerLeuArgSerLysGlu
2521

AATGAAAACTGGTTCAACATGGAAATGATTTTCATTGATTCGTATGCCAGCTC ACCTTTT 
TTACTTTTG ACC AAGTTGTACCTTTACT AAAAGTAACT AAGC ATACGGTCGAGTGG AAAA 
End 



TATGATCTGCC ATTTCAAAGACTCATGTTTCTGCTATGACCATG AC ACG ATTTAAATCTT 



+ 



ATACTAGACGGT AAAGTTTCTGAGTACAAAGACG ATACTGGTACTGTGCTAAATTTAGAA 



TTC AAATGTTTTTAGGAGTATTAATC AACATTGTATTCAGCTC TTAAGGC ACTAGTCCCT 
AAGTTTAC AAAAATCCTC ATAATTAGTTGTAACATAAGTCG AGAATTCCGTGATC AGGGA 



TACAGAGGACC ATGCTGACTGATCC ATTATCTATTTAAATATTTTTAAAATATTATTTAT 
ATGTCTCCTGGTACGACTGACTAGGTAATAGATAAATTTATAAAAATTTTATAATAAATA 



TTAACTATTTAT AAAACAACTTATTTTTGTTC ATATTATGTC ATGTGC ACC TTTGCAC AG 
+ + + + + + 

AATTGAT AAAT ATTTTGTTGAATAAAAAC AAGTATAATAC AGTACACGTGG AAACGTGTC 



TGGTTAATGTAATAAAAT ATGTTCTTTGTATTTGGTAAAAAAAAAAAAAAAAAAAAAAAA 
+ + + + + + 

ACC AATTACATTATTTTATAC AAGAAAC ATAAACC ATTTTTTTTTTTTTTTTTTTTTTTT 

EcoRI 

AAAAAAAAAAAACCGGATCATCGAGCTCGAATTC

+ + + 

TTTTTTTTTTTTGGCCTAGTAGCTCG AGCTTAAG 

2914



FIG. 11 sheet 5 



WO 86/06097 



PCT/GB86/00209 




FIG. 12 (site labels: BamHI/BglII fusion, EcoRI, HindIII)






EcoRI 

GAATTC AAGCTAGATGCTAAGCGATATTGC ATGGC AATATGTGTTG ATGCATGTGCTTCT 
CTTAAGTTCGATCT ACGATTCGCTATAACGTAC CGTTATACAC AACTACGTACACGAAGA 

TCCTTCAGCTTCCCCTCGTGCAGATGAAGGTTTGGCTATAAATTGAAGTGGTTGGTCGGG 
+ + + + + + 

AGG AAGTCG AAGGGG AGCACG TCTACTTCC AAACCG ATATTTAACTTCACC AACC AGCCC 



GTTCCGTGAGGGGCTGAAGTGC TTCCTCCCTTTTAGACGC AAC TGAGAGCCTGAGCTTCA 
CAAGGC ACTCCCCGACTTC ACGAAGGAGGGAAAATCTGCGTTGACTC TCGGACTCGAAGT 



TCCCC AGCATCATTACACCTCAGCAATGTCGTTCCGATCTCTACTCGCCCTGAGCGGCCT 
AGGGGTCGTAGT AATGTGG AGTCGTTAC AGC AAGGC TAGAG ATG AGCGGGACTCGCCGG A 

240 

MetSerPheArgSerLeuLeuAlaLeuSerGlyLeu 

BamHI/BglII fusion

241 EcoRI

cgtctgcacagggttggcaaatgtgatttccaagcgcgcaaagaTccag...gaattc

GCAGACGTGTCCCAACCGTTTACACTAAAGGTTCGCGCGTTTCTAGGTC...CTTAAG
ValCysThrGlyLeuAlaAsnValIleSerLysArgAlaLysIleGln

FIG.13 




1381 

C TCGGTGGC AGATCTGCG ATCGTCC ATAACCGTTC AGATGTTGATTGGAACTGGGTGGGG 
+ + + + + + 

G AGCC ACCGTC TAGACGC TAGC AGGT ATTGGC AAGTCT ACAACTAACCTTGACCC ACCCC 



TAG AC AGCTCCGAAG ACCGAGTGAACGTAT ACCTAAGACACTTTGACACGGCCGGAAC AC 
ATCTGTCG AGGC TTCTGGC TC ACTTGC ATATGG ATTCTGTGAAACTGTGCCGGCC TTGTG 



TGTAAGTCCCTTCGTATTTCTCCGCCTGTGTGGAGCTACCATCCAATAACCCCCAGCTGA 
+ + + + + + 

ACATTCAGGG AAGC ATAAAGAGGCGGAC AC ACCTCGATGGTAGGTTATTGGGGGTCGACT 



AAAAGC TGATTGTCGATAGTTGTGATAGTTCCCAC TTGTCCGTCCGC ATCGGC ATCCGC A 

+ + + + + +

TTTTCG AC TAACAGC TATC AAC AC TAT CAAGGGTGAAC AGGC AGGC GTAGCCGTAGGCG T 



GCTCGGG ATAGTTCCG ACCT AGGATTGGATGC ATGCGGAACCGCACg AGGGCGGGGCGGA 
+ + + + + + 

CG AGCC C TATC AAGGCTGGATCCT AAC CT AC GTACGCCTTGGC GTGc TCCCGCCCCGCCT 



AATTGAC AC ACCACTCCTCTCC ACGCAgCCG TTCAAGAGGTACGCGTATAGAGCCGTATA 
+ + + + + + 

TTAACTGTGTGGTGAGGAGAGGTGCGTcGGCAAGTTCTCCATGCGC ATATCTCGGC AT AT 

1740 
GAGCAGAGACGGAGCACTTTCTGGTACTGTCCGCACGGGATGTCCGCACGGAGAGCCACA
+ + + + + + 

CTCGTCTCTGCCTCGTGAAAGACC ATGAC AGGCGTGCCCTAC AGGCGTGCC TCTCGGTGT 



AACGAGCGGGGCCCCGTACGTGCTCTCCTACCCC AGGATCGC ATCCTCGC ATAGCTGAAC 
+ + + + + +

TTGCTCGCCCCGGGGC ATGC ACGAG AGG ATGGGGTCCTAGCGTAGG AGCGTATCGAC TTG 